Hiding in plain sight. A qualitative, text-based method for the analysis of websites
Keywords:
tools, methodologies, qualitative data, websites, discourse analysis
Abstract
This contribution develops a novel qualitative, text-based method for the analysis of websites. The texts of websites represent a huge resource for research that is rarely used in international development and only sparsely used in other fields. The method aims to make the most of the fact that websites provide a huge amount of ‘authentic’, unique, up-to-date, peer reviewed, topical data. Some of the challenges and opportunities of using website texts for textual analysis in research are outlined, followed by a draft protocol for documenting individual pages. The protocol overcomes many of these challenges, making websites more accessible for qualitative research and, in particular, for critical discourse analysis. The method builds on a previous article which piloted the analysis of website texts using critical discourse analysis (Cummings et al., 2025a). It can also be employed more widely as a way to document websites which might be under threat of neglect and destruction.
References
de Bernardi, C. (2019). Authenticity as a compromise: a critical discourse analysis of Sámi tourism websites. Journal of Heritage Tourism, 14(3), 249–262. https://doi.org/10.1080/1743873X.2018.1527844
Bogers, M., Biermann, F., Kalfagianni, A., & Kim, R. E. (2022). Sustainable Development Goals fail to advance policy integration: A large-n text analysis of 159 international organizations. Environmental Science and Policy, 138, 134–145. https://doi.org/10.1016/j.envsci.2022.10.002
Cardillo, A. (2025). How Many Websites Are On The Internet? (2025). Exploding Topics, 28 May 2025. https://explodingtopics.com/blog/how-many-websites-on-the-internet
Carneiro, L., & Johnson, M. (2014). Quantitative and qualitative visual content analysis in the study of websites. In Sage Research Methods Cases Part 1. SAGE Publications, Ltd. https://doi.org/10.4135/978144627305013517800
Chapekis, A., Bestvater, S., Remy, E., & Rivero, G. (2024). When Online Content Disappears: 38% of webpages that existed in 2013 are no longer accessible a decade later. Pew Research Center Report.
Cummings, S.J.R., Munthali, N., & Sittoni, T. (2025a). Epistemic justice as a ‘new normal’? Interrogating the contributions of communities of practice to decolonization of knowledge. Sustainable Development, 33(3), 3228–3245.
Cummings, S.J.R., White, N., & Boyes, B. (2025b). USAID and the new burning of the books in digital and ideological epistemicide. A call to action. RealKM Magazine, 27 February. https://realkm.com/2025/02/27/usaid-and-the-new-burning-of-the-books-in-digital-and-ideological-epistemicide-a-call-to-action/
Cummings, S., De Haan, L., & Seferiadis, A. A. (2020). How to use critical discourse analysis for policy analysis: a guideline for policymakers and other professionals. Knowledge Management for Development Journal, 15(1), 99-108.
Digital Preservation Coalition (2015). Digital Preservation Handbook, Second Edition.
Dillon-Shallard, D. (undated). Virtual presence, global impact: the indispensable benefits of a website for research institutes. Butterfly blog. https://butterfly.com.au/blog/website-for-research-institutes/
Fairclough, N. (2012a). The dialectics of discourse. Unpublished paper. https://www.sfu.ca/cmns/courses/2012/801/1-readings/Fairclough%20Dialectics%20of%20Discourse%20Analysis.pdf (Accessed 11 May 2016)
Fairclough, N. (2012b) Critical Discourse Analysis. In: The Routledge Handbook of Discourse Analysis, edited by James Paul Gee and Michael Handford, 9-21. Abingdon and New York: Routledge.
Fairclough, N. (2013). Critical discourse analysis: The critical study of language. Abingdon: Routledge.
Fernández-Vázquez, J. S. (2021). Selling organic candy: multimodal critical discourse analysis of commercial websites. British Food Journal, 123(10), 3277-3292.
Fletcher, W. H. (2004). Making the Web More Useful as a Source for Linguistic Corpora. Language and Computers, 52, 191–206.
Kim Technologies. (2024). Is your data hiding in plain sight? https://blog.kimdocument.com/blog/is-your-data-hiding-in-plain-sight
Mautner, G. (2005). Time to get wired: Using web-based corpora in critical discourse analysis. Discourse & Society, 16(6), 809-828.
Pauwels, L. (2012). A multimodal framework for analyzing websites as cultural expressions. Journal of Computer-Mediated Communication, 17, 247–265. https://doi.org/10.1111/j.1083-6101.2012.01572
Powell, W.W., Horvath, A., & Brandtner, C. (2016). Click and mortar: Organizations on the web. Research in Organizational Behavior, 36, 101–120. https://doi.org/10.1016/j.riob.2016.07.001
Sanz, R. & Hovell, J. (2021). Knowledge retention framework and maturity model: improving an organization or team’s capability to retain unique and critical knowledge. Knowledge Management for Development Journal 16(1): 8-27.
Seferiadis, A. A., de Haan, L., & Cummings, S. (2021). Feminist Critical Discourse Analysis of Ecopreneurship as an Instrument for Sustainable Development Grand Narratives and Local Stories. In Environmental Sustainability and Development in Organizations (pp. 1-17). CRC Press.
Shubladze, S. (2023). How To Make Use Of The New Gold: Data. Forbes, 27 March 2023. https://www.forbes.com/councils/forbestechcouncil/2023/03/27/how-to-make-use-of-the-new-gold-data/
Tambe, N., & Jain, A. (2024). Top website statistics for 2024. Forbes, 28 June 2024. https://www.forbes.com/advisor/in/business/software/website-statistics/?via=hemin74
UN. (2015). Transforming our world: the 2030 Agenda for Sustainable Development. https://sustainabledevelopment.un.org/post2015/transformingourworld/publication (Accessed 15 January 2019)
License
Copyright (c) 2026 Sarah Cummings, Bruce Boyes, Nyamwaya Munthali, Rocio Sanz

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The copyright of the articles published in this journal remains the property of the authors. For liability reasons, the title belongs to the Foundation for the Support of the Knowledge Management for Development Journal. The journal is published under a Creative Commons Attribution Non-Commercial Share Alike License. This journal is currently an open access journal as it has a funding model that does not charge readers or their institutions for access. From the BOAI definition [1] of "open access", we support the rights of users to "read, download, copy, distribute, print, search, or link to the full texts of these articles." However, some of the content (2009-2012) is only available on the Taylor and Francis website. Within the next few months, this issue too will become available on the OJS. [1] http://www.earlham.edu/~peters/fos/boaifaq.htm#openaccess