OpenPose is a pose estimator. In this tutorial, we will use its 2D keypoint detection model as an introduction to pose estimation. A 2D keypoint detector identifies the different people within a frame and detects their key anatomical points, known as ‘keypoints’. These keypoints are anatomical landmarks on the body; most 2D keypoint detectors focus on the eyes, nose, ears, neck, shoulders, elbows, wrists, hips, knees and ankles. The sample skeleton from OpenPose can be seen in Figure 1, which shows the 2D keypoints that OpenPose will identify for each subject present in a frame.
OpenPose, like most modern pose estimators, is a machine learning model trained on a specific dataset, in this case COCO (to learn more about this process, check out DOCUMENT X). We will use OpenPose to introduce you to pose estimation because there is a great tool for running it: a fully functioning script in Google Colab. You don’t need any Python experience to use this tool, although I recommend that, if you enjoy it, you start learning the language, as it’s essential for pose estimation.
Google Colab is a free, online Jupyter notebook environment, which means you can execute scripts and train algorithms using Google’s CPUs, GPUs and TPUs. I would not recommend it for training unless essential, as it takes longer than training on your own machine (contact Clara or Ciaran Simms if you need access to GPUs), but it works very well for running inference on a video with an already trained model.
The Google Colab notebook for OpenPose can be found here: OpenPose.ipynb - Colaboratory (google.com)
A Jupyter notebook is a script divided into cells so that you can run sections separately. This is especially useful because loading the model can sometimes take 20-30 minutes in this script, so if you are going to process more than one video, you do not want to run everything from scratch each time. You can instead replace the source video and rerun only the cell of the script that applies the model to the video.
As seen above in red, there are two types of cells: a text cell, which is a basic Markdown cell that gives some information on the script below it, and a code cell, which contains the script. To run a code cell, press the play button highlighted in green.
When you run this script for the first time, and until you get familiar with it, I recommend you run the full script before uploading your video. Scripts such as this one have specific requirements and expect a certain folder structure, which means that errors might arise if you forget to run a specific section, or if your video is stored in the wrong folder. In Figure 2, you can see the difference in folder structure before you run the whole script compared to afterwards.
The default video used in the script is 8 minutes long, which would take a considerable amount of time to run. To speed up the process, we will replace the video with a shorter one.
Use the YouTube ID 5HkF2YhWT1A to replace the default video:
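In the notebook, this change amounts to editing a single assignment. A minimal sketch, assuming the cell stores the ID in a variable called YOUTUBE_ID (the variable name is an assumption, so check it against your copy of the notebook):

```python
# Assumed variable name: confirm it matches the cell in your copy of the notebook.
YOUTUBE_ID = "5HkF2YhWT1A"

# The ID is the part of a YouTube link that follows "watch?v=".
video_url = f"https://www.youtube.com/watch?v={YOUTUBE_ID}"
print(video_url)
```

Printing the full link is just a quick way to confirm that you copied the ID correctly before running the rest of the script.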
To run the script, open the Runtime tab (highlighted in blue in Figure 1) at the top and select ‘Run All’. This will run every code cell in order. Once the script has run, the model will be loaded and can be applied to any .mp4 file without being reloaded (unless the Colab runtime gets disconnected, which happens if you leave the page idle for too long).
Once the script has run, the section on the left-hand side of the page will populate with the output folders (Figure 2b).
The cell shown below runs the pose estimation model on the video:
The first three lines deal with downloading and cropping the YouTube video. At this stage we have already loaded the model and will not be using a YouTube video anymore. We will instead load our own .mp4 file, so to prevent this code from interfering with our file it’s best to comment it out (highlight the first three uncommented lines and press Ctrl + /).
The line that starts with !cd openpose is the line that runs the model. As you can see highlighted in blue, it expects an .mp4 file called ‘video.mp4’. Therefore, to run this line successfully you have two options: either name your file ‘video.mp4’, or change ../video.mp4 in the command to ../nameofyourfile.mp4.
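To make the second option concrete, here is a minimal sketch that builds the command for a given file name. Only the ../video.mp4 argument comes from the text above; the binary path and the remaining flags are my recollection of the notebook cell, so treat them as assumptions and compare them with your copy. The helper openpose_command is hypothetical and not part of OpenPose.

```python
import shlex


def openpose_command(video_name: str = "video.mp4") -> str:
    """Build the shell line that runs OpenPose on a video stored one folder above openpose/."""
    args = [
        "./build/examples/openpose/openpose.bin",  # assumed binary path
        "--video", f"../{video_name}",             # the only part you need to change
        "--write_json", "./output/",               # assumed: one .json file per frame
        "--display", "0",                          # assumed: no GUI window in Colab
        "--write_video", "../openpose.avi",        # assumed: rendered skeleton video
    ]
    # shlex.quote protects file names that contain spaces or other special characters
    return "cd openpose && " + " ".join(shlex.quote(a) for a in args)


print(openpose_command("nameofyourfile.mp4"))
```

In the notebook itself you would paste the printed line into a code cell with a leading !. File names without spaces are the safer choice, since they need no quoting.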
The final step before running the cell again is uploading your video. For this, it is important to note where video.mp4 was saved. To upload your video, use the upload button highlighted in red and upload it to the same folder as video.mp4.
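If you want to confirm that the upload landed in the right place before running the model, a short check in a new code cell can help. This is a minimal sketch, assuming the videos live in /content (one level above openpose/, as the ../video.mp4 path suggests); the helper video_is_in_place is a hypothetical name.

```python
from pathlib import Path


def video_is_in_place(name: str = "video.mp4", folder: str = "/content") -> bool:
    """Return True if an .mp4 file with this name exists in the given folder."""
    path = Path(folder) / name
    return path.is_file() and path.suffix.lower() == ".mp4"


# Replace the name with your own file name if you did not rename it to video.mp4
print(video_is_in_place("video.mp4"))
```

If this prints False, the file is most likely in a different folder or has a different name than the one used in the command.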
Once the model has run on your video, it will create a folder inside openpose/ called output, where the 2D keypoints for each frame are stored in separate .json files. To download all these files, you can create a new code cell and run the following line:
!zip -r /content/output.zip /content/openpose/output
This will create a .zip file containing all the .json files, stored in the same folder as the one you uploaded your video to. You should be able to download this file by right-clicking it. You should be able to use either MATLAB or Python to read in these .json files. Each of them will contain an array with the 2D keypoints; however, it’s important to note that