Last year I built a very smart crawler, which gave me tons of ideas for building several applications. One of those applications was Kuai List. This service ran on a smart agent that was able to grab non-spammy products and re-post them as photo collages on a website. Although it wasn’t a big hit, I do recall the site gaining some momentum, mostly out of curiosity about how I built the platform. The challenges for that one-month project were immense, but I kept figuring out how to circumvent every roadblock I encountered.
I learned a lot from the Facebook Graph API and about why Facebook chose to rebuild their core API. First of all, it had a lot of vulnerabilities that developers and hackers were taking advantage of – do you remember those annoying game invitations? I won’t go into that here because it would get redundant. Instead, I will talk a little about the experience and what’s next.
Two things helped me in the process of building my smart crawler: tremendous determination and intensive research. I learned so much that I will be bold enough to say I became an expert at using Facebook’s API. I even found a few loopholes, which I have no intention of mentioning here. It was my first ever smart crawler (bot), and it really made me see data differently. Since then, a lot has changed, and there is data all over the place to mine besides Facebook. Reminiscing over those days, and seeing the rising demand for smart agents (e.g., personal assistants and content automation), I have come up with a new challenge this year. For those who know me personally, you already know that I love taking on challenges with the main intention of learning. This challenge is like no other I have undertaken in the past. The idea is simple: create 12 different smart agents (bots), one for every month of the year.
I am already in the process of creating my first two bots, which I will release as open source by the end of this month. I will also be documenting the process and releasing videos such as this one. One of the bots includes a project I started last year, so practically speaking that one doesn’t count. I will be working on some very interesting ideas, some of which involve mining information from sources like Amazon, Twitter, Facebook and Medium. I will use different technologies to store this data as I see fit, which means you can also learn about database technologies (e.g., Cassandra, MongoDB, PostgreSQL and GraphDB). I will also be using some old and emerging technologies such as Node.js, React.js, Nginx, RoR 5.0, Ruby, Python 3, Socket.io, Docker, Elasticsearch, Kibana and D3. I have very limited experience with some of these technologies, so that in itself will be a challenge for me. Additionally, I will be aiming to build some cool services, applications and research projects from these smart bots, which you might find useful. Everything will be documented and open-sourced, including data sets. I have set up multiple channels for relaying this adventure: screencasts, live-stream videos and blog articles. I have provided some links below.
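None of the bots are released yet, but to give a rough idea of the mine-then-store loop such an agent performs, here is a minimal, hypothetical Python 3 sketch. Everything in it is invented for illustration (the sample HTML, the table schema, the function names); the real projects will use the stacks listed above, and a real bot would fetch pages over the network rather than from a string.

```python
import re
import sqlite3

# Hypothetical snippet of a product listing page. A real bot would
# download this HTML (e.g., with urllib) instead of hard-coding it.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">USB Lamp</span><span class="price">$9.99</span></li>
  <li class="product"><span class="name">Desk Fan</span><span class="price">$14.50</span></li>
</ul>
"""

def extract_products(html):
    """Pull (name, price) pairs out of the listing markup."""
    pattern = re.compile(
        r'<span class="name">(.*?)</span><span class="price">\$(.*?)</span>'
    )
    return [(name, float(price)) for name, price in pattern.findall(html)]

def store_products(products):
    """Persist scraped items in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, price REAL)")
    conn.executemany("INSERT INTO products VALUES (?, ?)", products)
    conn.commit()
    return conn

products = extract_products(SAMPLE_HTML)
conn = store_products(products)
count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
```

The shape stays the same whatever the source or store: extract structured records from unstructured input, then persist them somewhere queryable – only the fetcher, parser and database swap out per bot.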
Frankly, there is a lot of work ahead, so I will cut this article short and provide some links where you can track my work. I am also open to suggestions and collaborations if anyone is interested. To be honest, this is a very challenging endeavor, but one I will try to enjoy and share with you as much as possible.
You can also learn more about one of my other challenges (1000 micro-learning flashcards) here.