A project on building a web crawler to collect stock fundamentals and review their performance in one go
Stock investing, if done properly, tends to yield better returns in the long run than more conservative investment vehicles. Success comes from picking stocks with solid fundamentals out of the thousands that are listed, but finding them is as difficult as finding a unicorn. Picking a fundamentally sound stock means investigating it from different angles: evaluating fundamental ratios, analyzing the company’s management, gauging the impact of its products in the consumer market, studying its competitors, and more. Every step is crucial to the decision and each demands a substantial amount of time, so repeating these steps for every listed stock is not practical. We have to narrow the field to a handful of promising stocks using criteria that are usually set on profitability indicators. Reviewing a stock’s fundamentals for the current year alone does not reveal much and can even be misleading, because that financial year may have been unusual for some reason. Instead, we need to look at the performance indicators over the past few years to get a clear picture of the company’s performance.
I took up this project to automate the initial stock screening process and its most important aspect, i.e. reviewing the trend of performance indicators, for Indian companies. Reviewing company fundamentals involves understanding how well the company has performed over the past few years by looking at annual figures and their trends. Doing this manually is tedious: it involves collecting the list of stocks that meet our criteria, visiting the page of each stock on the screened list, fetching the historical data of its performance indicators, and plotting them to understand the trend. I decided to build a ‘web crawler’ in Python that does all these tasks in one go. To summarize, the objective of this project is to pick the best value stocks from those that pass the screening criteria by reviewing their historical performance.
This approach involves the following steps:
In this step, we initialize the Selenium web driver and use it to log in to the website by submitting our credentials. Screener.in is the data source, and its login link is: https://www.screener.in/login/
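The login step could look roughly like the sketch below. It is only an illustration under assumptions: the form field names `username` and `password` are guesses about the login page, and the helper name `login` is made up for this example. The locator strategy is passed as the plain string "name", which is the value of Selenium's `By.NAME`, so the function works with any driver object that offers `get` and `find_element`.

```python
LOGIN_URL = "https://www.screener.in/login/"


def login(driver, username, password, login_url=LOGIN_URL):
    """Open the login page and submit the credentials.

    `driver` is expected to behave like a Selenium WebDriver.
    The field names below are assumptions about the login form.
    """
    driver.get(login_url)

    # "name" is the string value of selenium.webdriver.common.by.By.NAME.
    driver.find_element("name", "username").send_keys(username)

    password_box = driver.find_element("name", "password")
    password_box.send_keys(password)

    # submit() sends the form that contains this element.
    password_box.submit()
    return driver


# Possible usage with a real browser (requires the selenium package):
#
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   try:
#       login(driver, "my-user", "my-password")
#   finally:
#       driver.quit()
```

Keeping the driver as a parameter means the same browser session can be reused for the query and scraping steps after the login succeeds.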
Once we are logged in, we have access to the data and can run a query to filter the stocks that pass our desired criteria. I have set a simple query, 'Market Capitalization > 0'. Running it lists all the companies that have a market capitalization above zero. Below is a screenshot of the resulting page.
Query link: https://www.screener.in/screen/raw/?sort=&source=&order=&page=1&query=Market+Capitalization%3E0
Notice that 3879 results passed our criteria and that they are spread across 156 pages. To crawl pages 1-156 and collect all the resulting stock links, we need to change the page number embedded in the query link at "&page=1&". We are now on page 1, so let's collect all the stock page links and store them in a list. The links can be scraped by extracting the 'href' attributes of the anchor tags that point to stock pages, using the 'bs4' package. We then need to visit each stock page to source the data from it. This is done by creating a BeautifulSoup object of the page, locating the tags that correspond to the data we are interested in, and storing the data in an array format. Scroll over the example below to get a glance at the web page.
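As a rough sketch of the two ideas above, the snippet below rewrites the `page` parameter of the query link and pulls stock links out of a results page. To keep it free of extra packages, it uses the standard library's `html.parser` in place of bs4. The helper names (`page_url`, `stock_links`) are hypothetical, and the assumption that stock page links start with `/company/` would need to be checked against the real page.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlencode, urljoin, urlsplit, urlunsplit

BASE_URL = "https://www.screener.in"
QUERY_URL = ("https://www.screener.in/screen/raw/"
             "?sort=&source=&order=&page=1&query=Market+Capitalization%3E0")


def page_url(query_url, page):
    """Return query_url with its existing `page` parameter set to `page`."""
    parts = urlsplit(query_url)
    params = [(key, str(page) if key == "page" else value)
              for key, value in parse_qsl(parts.query, keep_blank_values=True)]
    return urlunsplit(parts._replace(query=urlencode(params)))


class StockLinkParser(HTMLParser):
    """Collect absolute URLs of anchors whose href starts with a prefix."""

    def __init__(self, prefix="/company/"):  # prefix is an assumption
        super().__init__()
        self.prefix = prefix
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.startswith(self.prefix):
            url = urljoin(BASE_URL, href)
            if url not in self.links:  # keep order, skip duplicates
                self.links.append(url)


def stock_links(html):
    """Return the stock page links found in the HTML of one results page."""
    parser = StockLinkParser()
    parser.feed(html)
    return parser.links


# One URL per results page, for pages 1 to 156.
all_page_urls = [page_url(QUERY_URL, n) for n in range(1, 157)]
```

With bs4, a comparable filter could start from `soup.find_all("a", href=True)` and keep the anchors whose `href` matches the stock page pattern.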
The page contains several tables of historical data on parameters that describe the past performance and financial health of the company. I have considered only a few indicators which, based on my intuition, determine the stability and profitability of a company in a competitive environment in the long run. You will see the selected indicators in the plot further below.
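Turning such tables into arrays could be sketched as follows. This is a simplified illustration, not the exact scraping code of the project: it uses the standard library's `html.parser` rather than bs4, it assumes each indicator sits in a table row whose first cell is the indicator name, and the indicator names used in the usage comment ("Sales", "OPM %") are placeholders.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect every table row on a page as a list of cell texts."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text pieces of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            if self._cell is not None and self._row is not None:
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def to_number(text):
    """Convert '1,234' or '12%' to a float; return None for blank cells."""
    cleaned = text.replace(",", "").replace("%", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


def indicator_series(html, wanted):
    """Map each wanted indicator name to its list of yearly values."""
    parser = TableParser()
    parser.feed(html)
    return {row[0]: [to_number(cell) for cell in row[1:]]
            for row in parser.rows if row and row[0] in wanted}


# Possible usage, with placeholder indicator names:
#   series = indicator_series(page_html, {"Sales", "OPM %"})
```

Returning `None` for blank or non-numeric cells keeps the yearly values aligned with the table columns, which matters when the series are plotted against the years.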
Collecting just the numbers won't tell us much, as they are difficult to interpret on their own. Instead, we can create visual plots on the fly that tell the story of the company and may hint at where it is heading. Below I have added a trend plot of performance indicators for Avanti Feeds Ltd as an example. The company is mainly in the aquaculture feed manufacturing business, along with the production of value-added shrimp products.
I will present here some of my insights about its management. Let's go over them.
These insights just scratch the surface; there is more to it, such as understanding the interactions between indicators, which gives even better insights. Reading plots and developing a story is an art that can be mastered with practice and experience. It is advisable to go over the financial reports of the past few years to learn the actual reasons behind the changes observed in the plots.
We have now scraped the data for the first company on the first page. This exercise has to be repeated for all the stock links on the first page, then on the second page, and so on.
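The iteration described above can be sketched as a small loop. The function name `crawl`, its parameters, and the optional limits are all hypothetical. Page fetching, link listing, and scraping are passed in as callables, so the sketch does not depend on how Selenium or bs4 are wired up.

```python
def crawl(fetch_page, list_stock_links, scrape_stock, page_urls,
          max_pages=None, max_stocks_per_page=None):
    """Visit result pages in order and scrape the stocks listed on each.

    fetch_page(url)        -> page content (for example, HTML text)
    list_stock_links(page) -> list of stock page URLs on a results page
    scrape_stock(page)     -> scraped data for one stock page
    The two limits are optional and mainly useful for a short test run.
    """
    results = {}
    for page_no, url in enumerate(page_urls, start=1):
        if max_pages is not None and page_no > max_pages:
            break
        links = list_stock_links(fetch_page(url))
        if max_stocks_per_page is not None:
            links = links[:max_stocks_per_page]
        for link in links:
            # Scraped data is stored under the URL of the stock page.
            results[link] = scrape_stock(fetch_page(link))
    return results
```

Calling it with `max_pages=3` and `max_stocks_per_page=3` would restrict a run to a small sample, which is handy before crawling every page.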
Let's put our web crawler into action. For demonstration purposes, I made the crawler log in to the source, visit the first three pages, scrape the first three stock pages on each of them, and log out from the source. Below is a clip of it.