Parsing Upwork Job Feed to Monitor Clojure Jobs

I was checking Upwork to assess the job market for Clojure when it hit me: I can parse the Upwork job feed for Clojure and monitor it programmatically. So I fired up the REPL and started coding.

Before I began, I had to choose a Clojure library to parse RSS feeds. I went for feedparser-clj (https://github.com/scsibug/feedparser-clj) and added its dependency ([org.clojars.scsibug/feedparser-clj "0.4.0"]) to my project.clj:

(defproject cljnoob "0.0.1"
  :description "A very simple project to learn Clojure!"
  :main cljnoob.core
  :profiles { :dev {:dependencies [[org.clojure/tools.namespace "0.2.11"]]
                  :source-paths ["dev"]}}
  :dependencies [[org.clojure/clojure "1.8.0"]
                 [org.clojars.scsibug/feedparser-clj "0.4.0"]])

Now we can start writing some code. First, we fetch the content of the RSS feed and parse it. The parse-feed function from the above-mentioned library does that for us.

(def feed (parse-feed "https://www.upwork.com/ab/feed/jobs/rss?q=Clojure&api_params=1"))

Next, we need a function to extract the data we need. We will map this function over the collection of items.

(defn extract-title-and-url
  [item]
  {:title (:title item) :url (:uri item)})

Here we’re simply getting the values of the :title and :uri keys and putting them in a new hash map. We name our key :url instead of the feed’s :uri.
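
An equivalent way to express this reshaping is select-keys followed by clojure.set/rename-keys. The snippet below is only a sketch: sample-entry is made-up data that assumes an entry behaves like a map with :title and :uri keys, and extract-with-rename is a hypothetical name, not part of feedparser-clj.

```clojure
(require '[clojure.set :as set])

;; Hypothetical entry, shaped like the items found under :entries.
(def sample-entry
  {:title       "Clojure developer needed"
   :uri         "https://example.com/jobs/1"
   :description "Not needed for our output"})

;; Keep only the keys we care about, then rename :uri to :url.
(defn extract-with-rename
  [item]
  (-> item
      (select-keys [:title :uri])
      (set/rename-keys {:uri :url})))

(println (extract-with-rename sample-entry))
```

One difference worth knowing: if an entry lacks :uri, select-keys leaves the key out entirely, whereas looking the key up explicitly would produce :url nil.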

The collection of items lives under the :entries key of the feed var we defined earlier. So here’s our main function:

(defn -main
  []
  (let [jobs (map extract-title-and-url (:entries feed))]
    (doseq [item jobs] (println (str (:title item) " - " (:url item))))))

We’re mapping the function we wrote over the entries and getting a collection of hashmaps. Then we’re using doseq to iterate over them and print the data out.
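
To try the map-then-doseq pattern at the REPL without hitting the network, it can help to separate formatting from printing. This is a rough sketch under assumptions: sample-feed is invented data that mimics a parsed feed with an :entries key, and job-line and job-lines are hypothetical helper names.

```clojure
;; Hypothetical feed, shaped like a parsed feed with an :entries key.
(def sample-feed
  {:entries [{:title "Job A" :uri "https://example.com/a"}
             {:title "Job B" :uri "https://example.com/b"}]})

(defn extract-title-and-url
  [item]
  {:title (:title item) :url (:uri item)})

(defn job-line
  "Formats one extracted job as \"title - url\"."
  [{:keys [title url]}]
  (str title " - " url))

(defn job-lines
  "Returns a lazy sequence of formatted lines for every entry in the feed."
  [feed]
  (map (comp job-line extract-title-and-url) (:entries feed)))

;; map is lazy and produces no output by itself;
;; doseq walks the sequence and performs the printing.
(doseq [line (job-lines sample-feed)]
  (println line))
```

Keeping job-lines free of side effects means its result can be inspected or tested directly, while doseq stays the only place where printing happens.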

The final code looks like this:

(ns cljnoob.core
  (:require [feedparser-clj.core :refer [parse-feed]]))

(def feed (parse-feed "https://www.upwork.com/ab/feed/jobs/rss?q=Clojure&api_params=1"))

(defn extract-title-and-url
  [item]
  {:title (:title item) :url (:uri item)})

(defn -main
  []
  (let [jobs (map extract-title-and-url (:entries feed))]
    (doseq [item jobs] (println (str (:title item) " - " (:url item))))))

Here, we have extracted only two fields and printed them out. Copying the data into a new hash map was just for illustration. We could print the fields directly from the parsed entries instead, which makes the code shorter:

(ns cljnoob.core
  (:require [feedparser-clj.core :refer [parse-feed]]))
  
(defn -main
  []
  (let [jobs (:entries (parse-feed "https://www.upwork.com/ab/feed/jobs/rss?q=Clojure&api_params=1"))]
    (doseq [item jobs] (println (str (:title item) " - " (:uri item))))))
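
For actual monitoring, the program would need to tell new jobs apart from ones it has already reported. One possible approach, sketched here under assumptions, is to remember the :uri of every entry seen so far. The names seen-uris and new-entries! are hypothetical, the state lives only in memory, and the sketch assumes each entry has a unique :uri.

```clojure
;; URIs that have already been reported (in memory only, lost on restart).
(def seen-uris (atom #{}))

(defn new-entries!
  "Returns the entries whose :uri has not been seen yet and records them as seen."
  [entries]
  ;; vec forces the filtering before the atom is updated; a lazy remove
  ;; would otherwise consult the already-updated set and return nothing.
  (let [fresh (vec (remove #(contains? @seen-uris (:uri %)) entries))]
    (swap! seen-uris into (map :uri fresh))
    fresh))

;; Example with made-up entries:
(println (new-entries! [{:title "Job A" :uri "https://example.com/a"}]))
(println (new-entries! [{:title "Job A" :uri "https://example.com/a"}
                        {:title "Job B" :uri "https://example.com/b"}]))
```

In the real program, the argument would be the :entries of a freshly parsed feed, and the function would be called each time the feed is fetched again.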