What is Observer?

Observer is a personal experiment with the following design goals:

  1. Collect data from a variety of sources.
  2. Sort, correlate, and weight sources and data.
  3. Apply algorithms to identify emergent topics and their sources.
  4. Alert paying subscribers as novel ideas and occurrences emerge.


Developer Notes

Institutions are created to group sources.

observer_prod=> select * from media.institutions limit 3;
 id |                 uuid                 |          created           | name | note |        url_key         | language | is_english 
----+--------------------------------------+----------------------------+------+------+------------------------+----------+------------
    | e3f346bf-5fd7-43a1-9fe2-db15be1d1081 | 2024-09-11 16:36:45.250154 |      |      | tempo.com.ph           |          | f
    | 75f04c5b-70c4-4d0c-8d10-4c5b3ac229ff | 2024-09-11 16:36:45.251022 |      |      | www.threepanelsoul.com |          | f
    | dae2029a-7b8f-4a5d-bc98-2fd757385834 | 2024-09-11 16:36:45.2519   |      |      | rss.gazetaprawna.pl    |          | f
(3 rows)

Sources hold data about each URL within the system.

observer_prod=> select * from media.sources limit 3;
 id  |                 uuid                 |          created           |          name          | note |                      source_url                       | fetch_method | handle_method |        last_fetched        | active | id_institution | language | is_english 
-----+--------------------------------------+----------------------------+------------------------+------+-------------------------------------------------------+--------------+---------------+----------------------------+--------+----------------+----------+------------
     | a55a1465-2003-4c50-9b64-00084e790977 | 2024-09-11 16:37:00.143056 | BBC News               |      | http://feeds.bbci.co.uk/news/world/asia/india/rss.xml |              |               | 2025-01-14 03:38:12.955364 | t      |                |          | f
     | 304936aa-fd58-49f9-a965-59cd463feac5 | 2024-09-11 16:37:05.365737 | CBC | Top Stories News |      | https://www.cbc.ca/cmlink/rss-topstories              |              |               | 2025-01-01 17:01:53.40127  | f      |                |          | f
     | 6892980c-04d9-4dab-9e8b-d167f9302fc6 | 2024-09-11 16:37:01.085708 | BBC News               |      | http://feeds.bbci.co.uk/news/world/rss.xml            |              |               | 2025-01-14 03:38:13.213961 | t      |                |          | f
(3 rows)
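
The url_key values in media.institutions look like hostnames, which suggests that a source could be matched to its institution by the host of its source_url. That is an assumption read off the sample rows, not something the schema confirms. A minimal sketch of the idea, using the three source_url values shown above (url-host and group-by-host are hypothetical helpers, not part of Observer):

```clojure
;; Assumption: an institution's url_key is the hostname of its sources' URLs.
;; url-host and group-by-host are hypothetical helpers, not part of Observer.

(defn url-host
  "Returns the host portion of a URL string."
  [url]
  (.getHost (java.net.URI. url)))

(defn group-by-host
  "Groups a collection of URL strings by hostname."
  [urls]
  (group-by url-host urls))

;; The three source_url values from the sample query output.
(def sample-urls
  ["http://feeds.bbci.co.uk/news/world/asia/india/rss.xml"
   "https://www.cbc.ca/cmlink/rss-topstories"
   "http://feeds.bbci.co.uk/news/world/rss.xml"])

(group-by-host sample-urls)
```

Under that assumption, both BBC feeds would fall under a single feeds.bbci.co.uk institution and the CBC feed under www.cbc.ca.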

Items hold data about each article.

observer_prod=> select * from media.items limit 3;
    id     |                 uuid                 |          created           |                                     title                                      |                                                                                                             description                                                                                                             | text_content |       posted        |                                                                 article_link                                                                 |                                                      image_link                                                      | video_link | id_source |               uri                | id_institution | is_english | language 
-----------+--------------------------------------+----------------------------+--------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+---------------------+----------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------+------------+-----------+----------------------------------+----------------+------------+----------
           | df313f63-babc-41d8-b8a1-d61a300a946b | 2024-12-21 07:35:05.128839 | Поиски пропавшего на Камчатке самолета Ан-2 приостановили из-за темноты        | На Камчатке группировка спасателей, выдвинувшаяся на поиски пропавшего самолета Ан-2, остановилась в связи с наступлением темноты. Об этом сообщили в ГУ МЧС России по Камчатскому краю.                                            |              | 2024-12-21 07:30:00 | https://rg.ru/2024/12/21/reg-dfo/poiski-propavshego-na-kamchatke-samoleta-an-2-priostanovili-iz-za-nastupleniia-temnoty.html                 |                                                                                                                      |            |           | 50612de9ea4b8aa41d6057fcb46a8812 |                | f          | ru
           | ee422536-2df7-4102-94ab-5670f1400e52 | 2024-11-22 09:03:18.851295 | Ile mogą dorobić niektórzy emeryci i renciści? Od 1 grudnia 2024 r. nowe progi | Znamy już tzw. limity dorabiania do świadczeń emerytalno-rentowych, które będą obowiązywały od 1 grudnia 2024 r. Warto jednak pamiętać, iż progi te nie dotyczą wszystkich emerytów i rencistów. Kto nie musi martwić się limitami? |              | 2024-11-22 08:32:09 | https://serwisy.gazetaprawna.pl/emerytury-i-renty/artykuly/9674059,ile-moga-dorobic-niektorzy-emeryci-i-rencisci-od-1-grudnia-2024-r-no.html | https://ocdn.eu/pulscms-transforms/1/X6HktkuTURBXy8xYjkwMGEyYS03YzMyLTQyNDctOTY0NC1lMGJhMjQ1NTczZDkuanBlZ5GTBc0BHcyg |            |           | ad4f048db7b45433d958f9aa75ef4df7 |                | f          | pl
           | 0e4ced1a-2fdd-4e29-a4f6-41b01b10a6b6 | 2024-11-22 09:03:21.35581  | Milan, occhio a Belahyane. Il CorSport in apertura: "Inter sul nuovo Calha"    | Negli scorsi giorni alla mediana del Milan è stato accostato il nome di Reda Belahyane (QUI il focus di MilanNews.                                                                                                                  |              | 2024-11-22 08:50:00 | https://www.milannews.it/rassegna-stampa/milan-occhio-a-belahyane-il-corsport-in-apertura-inter-sul-nuovo-calha-557126                       |                                                                                                                      |            |           | 99fb4aaaa26135a79265dac234be21f3 |                | f          | it
(3 rows)

observer_prod=> select count(*) from media.items;
  count  
---------
 1961729
(1 row)

Analysis tables store compressed analytics derived from the content fetched from each article URL.


Luminus Framework

This system is built on top of the Luminus micro-framework.

System Organization

The home-routes function in the observer.routes.home namespace defines the application's page routes. Its "/" route invokes the home-page function whenever an HTTP GET request is made to the / URI.

(defn home-routes []
  [""
   {:middleware [middleware/wrap-csrf
                 middleware/wrap-formats]}
   ["/" {:get home-page}]
   ["/about" {:get about-page}]])

The home-page function will in turn call the observer.layout/render function to render the HTML content:

(defn home-page [request]
  (layout/render
    request 
    "home.html" {:docs (-> "docs/docs.md" io/resource slurp)}))

The render function will render the home.html template found in the resources/html folder using a parameter map containing the :docs key. This key points to the contents of the resources/docs/docs.md file containing these instructions.

The HTML templates are written using the Selmer templating engine.

<div class="content">
  {{docs|markdown}}
</div>

learn more about HTML templating »

Routing

The routes are aggregated and wrapped with middleware in the observer.handler namespace:

(mount/defstate app-routes
  :start
  (ring/ring-handler
    (ring/router
      [(home-routes)])
    (ring/routes
      (ring/create-resource-handler
        {:path "/"})
      (wrap-content-type
        (wrap-webjars (constantly nil)))
      (ring/create-default-handler
        {:not-found
         (constantly (error-page {:status 404, :title "404 - Page not found"}))
         :method-not-allowed
         (constantly (error-page {:status 405, :title "405 - Not allowed"}))
         :not-acceptable
         (constantly (error-page {:status 406, :title "406 - Not acceptable"}))}))))

The app-routes definition groups all the routes in the application into a single handler. A default handler is added to cover the 404, 405, and 406 cases.

learn more about routing »

The home-routes are wrapped with two middleware functions. The first enables CSRF protection. The second takes care of serializing and deserializing various encoding formats, such as JSON.
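
Because the route vectors are plain Clojure data, adding a page amounts to adding a vector that pairs a path with a map of method handlers. The sketch below is not taken from the Observer code base: status-page and example-routes are hypothetical names, and the shared route data map is left empty (no middleware) so the snippet evaluates without any project dependencies.

```clojure
;; Hypothetical handler: returns a plain-text Ring response map.
(defn status-page [_request]
  {:status 200
   :headers {"Content-Type" "text/plain"}
   :body "ok"})

;; Same shape as home-routes: a path prefix, shared route data, then child routes.
(defn example-routes []
  [""
   {}
   ["/status" {:get status-page}]])

;; The route tree is ordinary data, so the handler can be looked up and called directly.
(let [[_prefix _data [path methods]] (example-routes)
      handler (:get methods)]
  [path (:status (handler {:request-method :get :uri path}))])
;; => ["/status" 200]
```

In the real application, such a route group would be listed next to (home-routes) in the router vector so that requests actually reach it.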

Middleware

Request middleware functions are located under the observer.middleware namespace.

This namespace is reserved for any custom middleware for the application. Some default middleware is already defined here. The middleware is assembled in the wrap-base function.
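
In Ring, a middleware is a function that takes a handler and returns a new handler, so custom middleware added to this namespace generally follows the shape below. This is a minimal sketch rather than Observer's actual code: wrap-served-by, wrap-echo-uri, hello-handler, and both header names are hypothetical, and the threading form only illustrates the kind of composition a function like wrap-base performs.

```clojure
;; Hypothetical middleware: adds a header to every non-nil response.
(defn wrap-served-by [handler]
  (fn [request]
    (some-> (handler request)
            (assoc-in [:headers "X-Served-By"] "observer"))))

;; Hypothetical middleware: copies the request URI into a response header.
(defn wrap-echo-uri [handler]
  (fn [request]
    (some-> (handler request)
            (assoc-in [:headers "X-Request-Uri"] (:uri request)))))

;; Stand-in for a real page handler.
(defn hello-handler [_request]
  {:status 200 :headers {} :body "hello"})

;; Middleware composes by threading; the last wrapper applied is the outermost.
(def app
  (-> hello-handler
      wrap-echo-uri
      wrap-served-by))

(app {:request-method :get :uri "/"})
```

Using some-> keeps a nil response as nil, which matters when a handler signals "no match" by returning nil and a later handler is expected to take over.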

Middleware used for development is placed in the observer.dev-middleware namespace found in the env/dev/clj/ source path.

learn more about middleware »

Visit the official documentation for examples of how to accomplish common tasks with Luminus. The #luminus channel on the Clojurians Slack and the Google Group are both great places to seek help and discuss projects with other users.