File(s) under permanent embargo
Comparison of three statistical classification techniques for maser identification
journal contribution
posted on 2023-05-18, 18:43 authored by EM Manning, Barbara Holland, Simon Ellingsen, SL Breen, X Chen, Melissa Humphries
We applied three statistical classification techniques (linear discriminant analysis (LDA), logistic regression, and random forests) to three astronomical datasets associated with searches for interstellar masers. We compared the performance of these methods in identifying whether specific mid-infrared or millimetre continuum sources are likely to have associated interstellar masers. We also discuss the interpretability of the results of each classification technique. Non-parametric methods have the potential to make accurate predictions when there are complex relationships between critical parameters. We found that for the small datasets the parametric methods, logistic regression and LDA, performed best. For the largest dataset, the non-parametric method of random forests performed with accuracy comparable to that of the parametric techniques, rather than showing any significant improvement. This suggests that, at least for the specific examples investigated here, the accuracy of the predictions obtained is not being limited by the use of parametric models. We also found that for LDA, transformation of the data to match a normal distribution led to a significant improvement in accuracy. The different classification techniques had significant overlap in their predictions; further astronomical observations will enable the accuracy of these predictions to be tested.
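The remark that transforming the data towards a normal distribution improved LDA can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' code or data: the data are synthetic, with a single log-normally distributed feature per source, the transformation is a plain logarithm (the abstract does not say which transformation was used), and the function names `fit_lda_1d`, `predict_lda_1d`, `accuracy` and `make_data` are hypothetical.

```python
import math
import random
import statistics


def fit_lda_1d(xs, ys):
    """Fit one-feature LDA: class means, pooled variance, class priors."""
    classes = sorted(set(ys))
    groups = {k: [x for x, y in zip(xs, ys) if y == k] for k in classes}
    means = {k: statistics.fmean(v) for k, v in groups.items()}
    # LDA assumes a single within-class variance shared by all classes.
    ss = sum((x - means[k]) ** 2 for k, v in groups.items() for x in v)
    var = ss / (len(xs) - len(classes))
    priors = {k: len(v) / len(xs) for k, v in groups.items()}
    return means, var, priors


def predict_lda_1d(model, x):
    """Return the class with the largest linear discriminant score."""
    means, var, priors = model

    def score(k):
        return x * means[k] / var - means[k] ** 2 / (2 * var) + math.log(priors[k])

    return max(means, key=score)


def accuracy(model, xs, ys):
    hits = sum(predict_lda_1d(model, x) == y for x, y in zip(xs, ys))
    return hits / len(ys)


def make_data(rng, n_per_class):
    """Synthetic stand-in data (assumption): one skewed feature per source."""
    xs, ys = [], []
    for label, mu in ((0, 0.0), (1, 2.0)):  # 0 = no maser, 1 = maser
        for _ in range(n_per_class):
            xs.append(rng.lognormvariate(mu, 1.0))
            ys.append(label)
    return xs, ys


rng = random.Random(0)
train_x, train_y = make_data(rng, 2000)
test_x, test_y = make_data(rng, 2000)

# Fit once on the raw feature and once on its logarithm.
raw_model = fit_lda_1d(train_x, train_y)
log_model = fit_lda_1d([math.log(x) for x in train_x], train_y)

acc_raw = accuracy(raw_model, test_x, test_y)
acc_log = accuracy(log_model, [math.log(x) for x in test_x], test_y)
print(f"raw: {acc_raw:.3f}  log-transformed: {acc_log:.3f}")
```

In this toy setting the logarithm makes each class exactly normal with equal variance, which is the model LDA assumes, so the transformed fit should score higher on the held-out sample. Real catalogue data will rarely be this clean.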
History
Publication title
Publications of the Astronomical Society of Australia
Volume
33
Article number
e015
Number
e015
Pagination
1-30
ISSN
1448-6083
Department/School
School of Natural Sciences
Publisher
Cambridge University Press
Place of publication
United Kingdom
Rights statement
Copyright Astronomical Society of Australia 2016
Repository Status
Restricted