Python Scripts and Jupyter Notebooks
Example hyperparameters optimization results table for LightGBM Regressor on Boston Housing data.
Read complete article on Medium.
An example of telegram chats which contain keyword 'bitcoin' or 'btc'
Example:
>>> tele_btc_messages.head()
Example:
$ cd DS-ML-Public
$ python telegram_user_status.py 12345 fe3922d77g6wgwgwyu35g46c9 bitgrit
Number of active users in last 24 hours is 1530.
User status
0 Dayal Chand online
1 Sameer recently
2 Dikesh Shah 2019-07-02 01:13:19
3 Crypto 2019-07-02 00:47:50
4 Billy 2019-07-02 01:32:49
$ git clone https://github.com/dc-aichara/DS-ML-Public.git
$ cd DS-ML-Public/WebScrapers
$ python3
>>> from crypto_news_scraper import NewsScrap
>>> news = NewsScrap()
>>> df_coindesk = news.coin_desk_news()
>>> df_coindesk.head()
category heading ... time source
0 news Dapp.com Closes $1 Million Investment Round Le... ... 2019-09-06 22:00:00 CoinDesk
1 news Telegram Finally Releases Code for Its $1.7 Bi... ... 2019-09-06 21:46:00 CoinDesk
2 news Massive $1 Billion Bitcoin Whale Transaction M... ... 2019-09-06 19:00:00 CoinDesk
3 news Ethereum Picks Early October for Testnet Activ... ... 2019-09-06 18:00:00 CoinDesk
4 news Dapp Data Site DappRadar Raises $2.33 Million ... ... 2019-09-06 17:00:00 CoinDesk
[5 rows x 6 columns]
>>> df_cointelegraph = news.cointelegraph_news()
>>> df_cointelegraph.head()
category heading ... time source
0 News Crypto and Blockchain Adoption Grows: 5 Import... ... 2019-09-09 11:15:03 CoinTelegraph
1 News World’s ‘First’ Blockchain Smartphone to Becom... ... 2019-09-09 08:15:03 CoinTelegraph
2 News Ethereum's Istanbul Hard Fork Implementation D... ... 2019-09-09 08:15:03 CoinTelegraph
3 News Blockchain Startup DappRadar Raises $2.33M Fro... ... 2019-09-09 08:15:03 CoinTelegraph
4 News Huobi’s Research Arm to Partner with the Unive... ... 2019-09-09 07:15:03 CoinTelegraph
[5 rows x 6 columns]
>>> df_all = news.get_all_news()
Getting news from CoinDesk!!
Getting news from Cointelegraph!!
Getting news from cryptonewsz!! This will take 1-2 mintues. 😉
>>> df_all.head()
category heading ... time source
0 news Dapp.com Closes $1 Million Investment Round Le... ... 2019-09-06 22:00:00 CoinDesk
1 news Telegram Finally Releases Code for Its $1.7 Bi... ... 2019-09-06 21:46:00 CoinDesk
2 news Massive $1 Billion Bitcoin Whale Transaction M... ... 2019-09-06 19:00:00 CoinDesk
3 news Ethereum Picks Early October for Testnet Activ... ... 2019-09-06 18:00:00 CoinDesk
4 news Dapp Data Site DappRadar Raises $2.33 Million ... ... 2019-09-06 17:00:00 CoinDesk
[5 rows x 6 columns]
$ git clone https://github.com/dc-aichara/DS-ML-Public.git
$ cd DS-ML-Public/WebScrapers
$ python3
>>> from inshorts_news_scraper import InshortsNews
>>> news = InshortsNews('business')
>>> df_b = news.get_news()
>>> df_b.head()
headings news short_by time category
0 BSNL plans to fire 30% contract staff unpaid s... BSNL is reportedly planning to lay off about 3... Anushka Dixit 2019-09-09 23:35:00 business
1 SAT overturns SEBI's 2 year-ban on PwC in ₹7,8... The Securities Appellate Tribunal (SAT) on Mon... Anushka Dixit 2019-09-09 21:29:00 business
2 Nissan CEO Hiroto Saikawa to step down on Sept... Nissan CEO Hiroto Saikawa will step down on Se... Dharna 2019-09-09 21:08:00 business
3 British Airways pilots begin 2-day strike over... British Airways pilots began a two-day strike ... Anushka Dixit 2019-09-09 20:18:00 business
4 SEBI making e-voting app for retail investors ... Markets regulator SEBI is working on an e-voti... Dharna 2019-09-09 18:04:00 business
>>> df_all = news.get_all_news()
>>> df_all.head()
headings news short_by time category
0 Conflict between India, Pak less heated now th... Speaking about tensions between India and Paki... Arshiya Chopra 2019-09-10 08:50:00 national
1 Bengaluru woman loses ₹95,000 after calling fa... A Bengaluru woman lost ₹95,000 after calling a... Pragya Swastik 2019-09-10 08:25:00 national
2 IAS officer who resigned is traitor, should go... BJP MP Anantkumar Hegde has called IAS officer... Apaar Sharma 2019-09-09 23:28:00 national
3 Stop drama, stand up, CISF allegedly tells wom... Virali Modi, a disability rights activist, has... Anmol Sharma 2019-09-09 23:10:00 national
4 Tech firms may be allowed to sell users' publi... India is reportedly mulling guidelines which w... Dharna 2019-09-09 23:00:00 national
>>> from japanese_news_scraper import JapaneseNewsScrap
>>> jp_news = JapaneseNewsScrap(24*60*60)
>>> df_coinpost = jp_news.get_coin_post_news()
>>> df_coinpost.head()
time heading ... link source
0 2019-10-08 15:30:48 米リップル社、大学ブロックチェーン研究イニシアチブで年次大会を初開催 ... https://coinpost.jp/?p=111090 CoinPost
1 2019-10-08 15:29:29 米NBAのキングス、ファン向けの独自仮想通貨発行を発表 ... https://coinpost.jp/?p=111088 CoinPost
2 2019-10-08 14:59:53 金融庁がブロックチェーン実験結果を公表、金融機関の顧客KYC情報を共有 ... https://coinpost.jp/?p=111189 CoinPost
3 2019-10-08 14:26:52 Chainlinkの新フレームワーク発表で、仮想通貨LINKが高騰 協賛にIntelなど ... https://coinpost.jp/?p=111080 CoinPost
4 2019-10-08 14:04:36 イーサリアム企業連合、ブロックチェーン仕様の新バージョン発表 「Devcon 5」で検証実施 ... https://coinpost.jp/?p=111170 CoinPost
[5 rows x 5 columns]
>>> from website_pages_scraper import WebScraper
>>> webscraper = WebScraper(main_page_url='https://www.example.com/')
>>> df = webscraper.scrap_website(depth=2)