Predicting the prevalence of lung cancer using feature transformation techniques

Zunaira Munawar, Fahad Ahmad, Saad Awadh Alanazi, Kottakkaran Sooppy Nisar, Madiha Khalid, Muhammad Anwar, Kashif Murtaza

Research output: Contribution to journal › Article › peer-review



The healthcare sector is one of the most important sectors of any country, as a large part of the country’s economy is associated with it. This research aims to contribute to the health sector by minimizing the expense of lung cancer diagnosis. The study seeks to devise an efficient method for the initial screening of symptomatic patients using their demographic and clinical data. It looks for an appropriate feature transformation technique, drawn from dimensionality reduction techniques, in combination with a suitable regression model that can perform this task robustly on the lung cancer dataset for early carcinoma diagnosis. Machine learning is today's most beneficial tool for equipping the health sector with state-of-the-art technology and for the betterment of humankind. The lungs play a vital role in the human body: they take in air from the atmosphere and pass oxygen into the bloodstream, which circulates it through the body. Many people die every year from lung carcinoma because lung cancer is usually identified at a late stage, owing to a lack of awareness. It stays unidentified because people often have pneumonia, which later turns into lung cancer. The proposed research seeks to help health professionals rationalize the primary diagnosis and treatment of lung carcinoma in developing countries. The proposed methodology selects the optimized combination of a regression-based machine learning technique and a feature transformation technique for the available patterns, based on demographic and clinical features of lung cancer patients. Based on the results gathered during training (RMSE = 0.1324, R2-Score = 0.7428) and testing (RMSE = 0.1273, R2-Score = 0.7405), it can be concluded that fast independent component analysis combined with elastic net regression provided the optimized results and outperformed the other candidate techniques.
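The combination named in the abstract, a feature transformation step followed by a regularized regression model scored with RMSE and R2, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the use of scikit-learn, the synthetic stand-in data, the number of components, the `alpha` and `l1_ratio` values, and the hypothetical helper `report`. The numbers it prints are unrelated to the results reported above.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption, NOT the study's dataset):
# 500 records with 10 observed features that are noisy linear mixtures
# of 5 non-Gaussian latent sources, and a continuous target in (0, 1).
rng = np.random.default_rng(0)
sources = rng.laplace(size=(500, 5))
mixing = rng.normal(size=(5, 10))
X = sources @ mixing + rng.normal(scale=0.1, size=(500, 10))
signal = sources @ rng.normal(size=5) + rng.normal(scale=0.5, size=500)
y = 1.0 / (1.0 + np.exp(-signal))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Feature transformation (FastICA) followed by elastic net regression.
# The second scaler rescales the independent components so that the
# penalty strength does not depend on the scale FastICA gives them.
model = make_pipeline(
    StandardScaler(),
    FastICA(n_components=5, random_state=0, max_iter=1000),
    StandardScaler(),
    ElasticNet(alpha=0.01, l1_ratio=0.5),
)
model.fit(X_train, y_train)


def report(name, X_part, y_part):
    """Print and return RMSE and R2 for one data split (hypothetical helper)."""
    pred = model.predict(X_part)
    rmse = float(np.sqrt(mean_squared_error(y_part, pred)))
    r2 = float(r2_score(y_part, pred))
    print(f"{name}: RMSE={rmse:.4f}, R2={r2:.4f}")
    return rmse, r2


train_rmse, train_r2 = report("train", X_train, y_train)
test_rmse, test_r2 = report("test", X_test, y_test)
```

Reporting both splits side by side, as the abstract does, makes it easy to see whether test error stays close to training error, which is the usual check for overfitting.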
Original language: English
Pages (from-to): 109-120
Number of pages: 12
Journal: Egyptian Informatics Journal
Issue number: 4
Publication status: Published - 13 Dec 2022


Keywords

  • Lung cancer
  • Carcinoma
  • Computerized tomography scan
  • Positron emission tomography scan
  • Machine learning
  • Dimensionality reduction
  • Feature transformation
  • Regression model
