You wanted to know how to fetch JSON data from the API below using Flume:
If you try to fetch the data from the URL using the HTTP source, it will not work. Flume's HTTP source does not pull data from a remote URL; it listens on a port of your own machine and accepts events that clients send to it. Pointing it at someone else's server will therefore not return anything.
I used exec as the source instead. With this approach, Flume executes a Python script that scrapes the data, and the script's output is fed to HDFS through Flume.
This worked on our end. Please follow the steps below:
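To illustrate the idea, here is a minimal sketch of the kind of script an exec source can run. It is not the attached scrape_weather.py (which uses selenium and beautifulsoup); it is a simplified stand-in written for Python 3 that assumes the endpoint returns plain JSON. The function names to_event_lines and emit are hypothetical, and the URL is passed on the command line. The exec source treats each line the command prints as one event, so the script prints one compact JSON record per line.

```python
import json
import sys
from urllib.request import urlopen


def to_event_lines(payload):
    """Turn a JSON document into a list of single-line JSON strings.

    A top-level list yields one line per element; any other value
    yields exactly one line.
    """
    data = json.loads(payload)
    records = data if isinstance(data, list) else [data]
    # Compact separators and escaped newlines keep each record on one line.
    return [json.dumps(r, separators=(",", ":"), sort_keys=True) for r in records]


def emit(url, out=sys.stdout):
    """Fetch the URL and write one JSON record per line to `out`."""
    with urlopen(url, timeout=30) as resp:
        payload = resp.read().decode("utf-8")
    for line in to_event_lines(payload):
        out.write(line + "\n")
    # Flush so Flume sees the lines even if the process keeps running.
    out.flush()


# Only fetch when a URL is given on the command line (assumed usage).
if __name__ == "__main__" and len(sys.argv) > 1:
    emit(sys.argv[1])
```

Flume would then run the script through the command setting of the exec source, with the API URL as its only argument.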
1) Please ensure that Python and the Firefox browser are installed on your VM.
2) Please install pip using the below command:
sudo yum install python-pip
3) Please install the beautifulsoup and selenium packages:
sudo pip install beautifulsoup
sudo pip install selenium
4) Now download the Python script scrape_weather.py attached to this email and save it on your Desktop.
5) Now download the flume.conf file attached to this email and copy its contents into the flume.conf file of your Flume installation.
6) Finally, run the following command from the terminal:
./flume-ng agent -n WeatherAgent -c conf -f /usr/lib/flume-ng/apache-flume-1.4.0-bin/conf/flume.conf
** I am assuming that you are using EdurekaVM
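For reference, a flume.conf for this kind of setup usually has the shape sketched below. This is only an illustration and not the attached file: the component names (weatherSource, memChannel, hdfsSink), the script path, and the HDFS path are assumptions that you would need to adapt to your VM. Only the agent name WeatherAgent is taken from the command above.

```properties
# Hypothetical component names; the agent name must match the -n option.
WeatherAgent.sources = weatherSource
WeatherAgent.channels = memChannel
WeatherAgent.sinks = hdfsSink

# Exec source: runs the script and turns each output line into an event.
# The script path is an assumption.
WeatherAgent.sources.weatherSource.type = exec
WeatherAgent.sources.weatherSource.command = python /home/edureka/Desktop/scrape_weather.py
WeatherAgent.sources.weatherSource.channels = memChannel

# In-memory channel between the source and the sink.
WeatherAgent.channels.memChannel.type = memory
WeatherAgent.channels.memChannel.capacity = 1000

# HDFS sink: writes events as plain text. The HDFS path is an assumption.
WeatherAgent.sinks.hdfsSink.type = hdfs
WeatherAgent.sinks.hdfsSink.hdfs.path = hdfs://localhost:8020/user/edureka/weather
WeatherAgent.sinks.hdfsSink.hdfs.fileType = DataStream
WeatherAgent.sinks.hdfsSink.hdfs.writeFormat = Text
WeatherAgent.sinks.hdfsSink.channel = memChannel
```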
This will fetch the data from the API and give you the correct picture. This is the best help we can extend, considering your query and requirement.
I hope this helps. Kindly take it ahead from here.