freeCodeCamp Scrapy Beginners Course Part 2: Setting Up Scrapy

In Part 2 of the Scrapy Beginner Course, we go through how to set up your Python environment and install Scrapy.

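We will walk through installing Python and pip, setting up a Python virtual environment on Linux, macOS and Windows, and installing Scrapy.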

The code for this part of the course is available on GitHub.

If you prefer video tutorials, then check out the video version of this course on the freeCodeCamp channel here.


How To Install Python

For this course, we assume you already have Python installed and have a basic understanding of coding with Python.

However, if you don't, we recommend following the steps outlined in this video to install Python on your machine.

How To Install pip

pip is the standard package installer for Python. It lets you quickly install third-party packages so you can reuse code from other people and companies in your Python projects.

You may already have pip installed as part of your Python installation. You can check this by running pip --version in your terminal or PowerShell prompt. If it prints something like the following, pip is installed correctly:

pip 22.3.1 from /usr/local/lib/python3.9/site-packages/pip (python 3.9)

If not, you can install pip with the following command on macOS/Linux operating systems (after Python is installed!):

python -m ensurepip --upgrade

For Windows machines:

py -m ensurepip --upgrade
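
The same check can also be made from inside Python. The following is a minimal sketch, not part of the original course code: the helper names pip_available and pip_version_line are made up for illustration, and it only inspects the interpreter that runs the script.

```python
import importlib.util
import subprocess
import sys


def pip_available() -> bool:
    """Return True if the pip package can be imported by this interpreter."""
    return importlib.util.find_spec("pip") is not None


def pip_version_line() -> str:
    """Run `python -m pip --version` with the current interpreter and return its output."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        capture_output=True,
        text=True,
        check=False,  # do not raise if pip is missing, just return empty output
    )
    return result.stdout.strip()


if pip_available():
    print(pip_version_line())
else:
    print("pip not found; try: python -m ensurepip --upgrade")
```

Calling pip through sys.executable -m pip ensures you are asking the same Python interpreter that is running the script, which matters when several Python versions are installed side by side.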

Python Virtual Environments

To avoid version conflicts down the road, it is best practice to create a separate virtual environment for each of your Python projects. This means that any packages (third-party code/modules) you install for a project are kept separate from other projects, so you don't inadvertently end up breaking other projects.

Depending on the operating system of your machine these commands will be slightly different.

venv is part of the Python 3 standard library (since Python 3.3) and makes it simple to set up and use virtual environments.
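
Because venv is a regular standard-library module, an environment can also be created from Python code. The sketch below is illustrative only: create_env is a hypothetical helper name, and the returned interpreter path assumes the default layout that venv produces (bin/ on Linux and macOS, Scripts\ on Windows).

```python
import os
import venv
from pathlib import Path


def create_env(env_dir: str, with_pip: bool = True) -> Path:
    """Create a virtual environment in env_dir and return the path to its Python interpreter."""
    # Roughly what `python3 -m venv <env_dir>` does on the command line
    venv.create(env_dir, with_pip=with_pip)
    # Windows puts executables in Scripts\, Linux and macOS use bin/
    if os.name == "nt":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


# Example (creates a folder named "venv" in the current directory):
# interpreter = create_env("venv")
```

For day-to-day use the command-line form shown in the OS-specific sections is simpler; the function form is mainly useful in setup scripts.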


Setting Up Your Python Virtual Environment On Linux

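Once you have Python installed, setting up a virtual environment on Linux is pretty simple. The commands below use apt, so they assume a Debian- or Ubuntu-based distro.

First, we refresh the package index so that we install the latest available versions of our packages, and install the optional tree utility.

$ sudo apt-get update
$ sudo apt install -y tree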

Then install python3-venv if you haven't done so already.

$ sudo apt install -y python3-venv

Next, we create and activate our Python virtual environment so that any new pip install commands install into the new venv folder:

$ cd /free_code_camp_scrapy  # or whatever your project folder is called
$ python3 -m venv venv
$ source venv/bin/activate

Setting Up Your Python Virtual Environment On macOS

On macOS, just run the following commands:

$ cd /free_code_camp_scrapy  # or whatever your project folder is called
$ python3 -m venv venv

We then activate the virtual environment so that any new pip install commands install into the new venv folder:

$ source venv/bin/activate

Setting Up Your Python Virtual Environment On Windows

Setting up a virtual environment on Windows is also pretty simple, but we will use virtualenv instead, as venv can be more complicated to set up on Windows.

Install virtualenv from your Windows Command Prompt, PowerShell, or whichever terminal you are using.

pip install virtualenv

Navigate to the folder where you want to create the virtual environment, and run the virtualenv command.

cd /free_code_camp_scrapy
virtualenv venv

We then activate the virtual environment so that any new pip install commands will install into the new venv folder.

venv\Scripts\activate
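
Whichever operating system you are on, you can confirm that the environment is active by asking Python itself. This is a small sketch rather than anything from the course repository: in_virtualenv is a hypothetical helper, and it relies on the fact that inside a virtual environment sys.prefix points at the environment folder while sys.base_prefix points at the base Python installation.

```python
import sys
import sysconfig


def in_virtualenv() -> bool:
    """Return True if the current interpreter is running inside a virtual environment."""
    # Old virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # make sys.prefix differ from sys.base_prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("Inside a virtual environment:", in_virtualenv())
print("Environment root:", sys.prefix)
# Directory where pip installs pure-Python packages for this interpreter
print("Packages install to:", sysconfig.get_paths()["purelib"])
```

If the first line prints False after activation, the terminal is most likely still using the system Python, so re-run the activate command in the same terminal window.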

How To Install Scrapy

With our virtual environment created and activated, now it is time to install Scrapy into it.

To do so, we just need to install Scrapy via pip:

pip install scrapy

To make sure everything is working, we can check that Scrapy was installed correctly by typing the command scrapy into the command line. You should get an output like this:

$ scrapy

Usage:
  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  check         Check spider contracts
  commands
  crawl         Run a spider
  edit          Edit spider
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  list          List available spiders
  parse         Parse URL (using its spider) and print the results
  runspider     Run a self-contained spider

If you get an output similar to the above, then you know you have successfully installed Scrapy.
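
The installation can also be checked from Python. The sketch below assumes Python 3.8 or newer (for importlib.metadata); installed_version is a hypothetical helper name, not part of Scrapy.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


scrapy_version = installed_version("scrapy")
if scrapy_version is None:
    print("Scrapy is not installed in this environment")
else:
    print(f"Scrapy {scrapy_version} is installed")
```

Run this with the virtual environment activated; otherwise it reports on whichever Python installation the terminal is currently using.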


Next Steps

Now that we have our environment set up, we will move on to creating our first Scrapy project.

This is Part 2 of the 12-part freeCodeCamp Scrapy Beginner Course.