Web scraping using R

Alex Bradley, Richard James

Research output: Contribution to journal › Article › peer-review


Abstract

The ubiquitous use of the Internet in daily life means that there are now large reservoirs of data that can provide fresh insights into human behavior. One of the key barriers preventing more researchers from utilizing online data is that they do not have the skills to access the data. This Tutorial addresses this gap by providing a practical guide to scraping online data using the popular statistical language R. Web scraping is the process of automatically collecting information from websites. Such information can take the form of numbers, text, images, or videos. This Tutorial shows readers how to download web pages, extract information from those pages, store the extracted information, and do so across multiple pages of a website. A website has been created to assist readers in learning how to web-scrape. This website contains a series of examples that illustrate how to scrape a single web page and how to scrape multiple web pages. The examples are accompanied by videos describing the processes involved and by exercises to help readers increase their knowledge and practice their skills. Example R scripts have been made available at the Open Science Framework.
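The four steps named in the abstract (download a page, extract information, store it, repeat across pages) can be sketched roughly as follows. This is a minimal illustration using the rvest package, not the authors' own scripts, which are hosted at the Open Science Framework. The inline HTML, the CSS selectors (`.item`, `h2`, `.price`), the helper name `scrape_page`, and the output file name are all hypothetical; two inline pages stand in for downloaded pages so that the sketch runs without a network connection.

```r
# Minimal sketch, assuming the rvest package is installed.
# All page content, selectors, and names below are hypothetical.
library(rvest)

# Inline HTML stands in for real pages. For a live site, each element
# would instead be a URL, which read_html() fetches and parses.
pages <- c(
  '<html><body>
     <div class="item"><h2>Alpha</h2><span class="price">10</span></div>
     <div class="item"><h2>Beta</h2><span class="price">20</span></div>
   </body></html>',
  '<html><body>
     <div class="item"><h2>Gamma</h2><span class="price">30</span></div>
   </body></html>'
)

scrape_page <- function(source) {
  page  <- read_html(source)             # step 1: download / parse the page
  items <- html_elements(page, ".item")  # step 2: find the repeated elements
  data.frame(
    # html_element() returns exactly one match per item, keeping rows aligned
    name  = html_text2(html_element(items, "h2")),
    price = as.numeric(html_text2(html_element(items, ".price"))),
    stringsAsFactors = FALSE
  )
}

# step 4: apply the same extraction to every page and combine the results
results <- do.call(rbind, lapply(pages, scrape_page))

# step 3: store the extracted information as a CSV file
out_file <- file.path(tempdir(), "scraped_items.csv")
write.csv(results, out_file, row.names = FALSE)
```

When scraping a real website, it is usual to pause between requests (for example with `Sys.sleep()`) and to check the site's terms of use before collecting data.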
Original language: English
Pages (from-to): 264-270
Number of pages: 7
Journal: Advances in Methods and Practices in Psychological Science
Volume: 2
Issue number: 3
Early online date: 30 Jul 2019
Publication status: Published - 1 Sept 2019

Keywords

  • web scraping
  • web crawling
  • reverse engineering
  • big data
  • open science
  • open materials
