PRISMA systematic review: The application of natural language processing (NLP) to identify greenwashing in sustainability reports within the oil and gas industry

Authors

  • Firli Amaliyah, Department of Accounting, Faculty of Economics and Business, Universitas Gadjah Mada, Yogyakarta City, Special Region of Yogyakarta, 55281, Indonesia
  • Athaya Harmana Putri, Department of Electrical Engineering and Information Technology, Faculty of Engineering, Universitas Gadjah Mada, Yogyakarta City, Special Region of Yogyakarta, 55281, Indonesia
  • Naajwaa Putri Andyna, Department of Accounting, Faculty of Economics and Business, Universitas Gadjah Mada, Yogyakarta City, Special Region of Yogyakarta, 55281, Indonesia
  • Zaky Khalif Amri, Department of Accounting, Faculty of Economics and Business, Universitas Gadjah Mada, Yogyakarta City, Special Region of Yogyakarta, 55281, Indonesia

DOI:

https://doi.org/10.61511/jimese.v3i1.2025.2004

Keywords:

greenwashing, natural language processing, oil and gas industry

Abstract

Background: Greenwashing refers to misleading sustainability claims not backed by real actions, commonly seen in the oil and gas industry due to its dependence on fossil fuels. While companies may publicly commit to sustainability, their investments often contradict these claims, obstructing global renewable energy efforts. This mismatch between statements and actions misleads stakeholders and complicates audit processes. As demands for transparency grow, there is a pressing need for systematic tools to detect greenwashing. Prior research highlights that the narrative format of sustainability reports makes manual detection difficult, underscoring the need for technology-based solutions.
Methods: This study aims to examine the application of Natural Language Processing (NLP), particularly the N-Gram model, in identifying indications of greenwashing in the oil and gas industry. The research uses a qualitative approach with a Systematic Literature Review (SLR) method and applies the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework.
Findings: The N-Gram model aids in feature extraction by converting raw text from sustainability reports into structured representations and detecting linguistic patterns commonly found in overstated sustainability claims. When combined with classification methods like Support Vector Machine (SVM), it improves the accuracy of greenwashing detection. Key findings show that NLP can support auditors in assessing greenwashing risks and improving the efficiency of sustainability audits. Moreover, the integration of this technology promotes greater transparency in corporate disclosures.
Conclusion: The application of the N-Gram model in the NLP context is effective in detecting greenwashing practices that were previously difficult to identify manually.
Novelty/Originality of this article: This study offers novelty through the application of the N-Gram NLP model within the oil and gas industry context, which has been rarely explored in previous research. The practical implications of this study open opportunities for cross-sectoral implementation and the development of data-driven greenwashing identification standards in the future.
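
Illustrative sketch: The pipeline summarised in the Findings (N-Gram feature extraction followed by an SVM classifier) might look roughly like the Python sketch below. It is not the implementation used in any of the reviewed studies. The example sentences, their labels (+1 for vague, promise-style wording, -1 for concrete, measurable wording), the function names, and the training settings are all hypothetical, and the small hinge-loss trainer stands in for a library SVM so that the example needs only the standard library.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def ngrams(tokens, n):
    """Return every contiguous run of n tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def extract_features(text, n_values=(1, 2)):
    """Count unigrams and bigrams in a single document."""
    tokens = tokenize(text)
    counts = Counter()
    for n in n_values:
        counts.update(ngrams(tokens, n))
    return counts

def build_vocabulary(docs):
    """Map each n-gram seen in the corpus to a column index."""
    grams = sorted({g for d in docs for g in extract_features(d)})
    return {g: i for i, g in enumerate(grams)}

def vectorize(text, vocab):
    """Turn raw text into a fixed-length vector of n-gram counts."""
    vec = [0.0] * len(vocab)
    for gram, count in extract_features(text).items():
        if gram in vocab:
            vec[vocab[gram]] = float(count)
    return vec

def train_linear_svm(X, y, epochs=200, lr=0.1, reg=0.01):
    """Linear SVM fitted by subgradient descent on the L2-regularised hinge loss.

    Labels must be +1 or -1. The hyperparameters are arbitrary assumptions.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1 - lr * reg) for wj in w]  # regularisation shrinkage
            if margin < 1:  # inside the margin: apply the hinge-loss step
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(x, w, b):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hypothetical toy corpus, invented for illustration only.
docs = [
    "We are committed to a greener future",
    "Our journey towards a sustainable tomorrow",
    "Scope one emissions fell by twelve percent in 2023",
    "Capital expenditure on solar reached 400 million dollars",
]
y = [1, 1, -1, -1]  # +1 = vague claim style, -1 = concrete disclosure style

vocab = build_vocabulary(docs)
X = [vectorize(d, vocab) for d in docs]
w, b = train_linear_svm(X, y)

for doc, x in zip(docs, X):
    print(predict(x, w, b), doc)
```

A real study would need a labelled corpus of report passages, held-out evaluation data, and most likely a tested library classifier. The sketch only shows how n-gram counts become the structured input that a margin-based classifier consumes.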

Published

2025-07-31

How to Cite

Amaliyah, F., Putri, A. H., Andyna, N. P., & Amri, Z. K. (2025). PRISMA systematic review: The application of natural language processing (NLP) to identify greenwashing in sustainability reports within the oil and gas industry. Journal of Innovation Materials, Energy, and Sustainable Engineering, 3(1). https://doi.org/10.61511/jimese.v3i1.2025.2004

Section

Articles
