Observer is a personal experiment with the following design goals:
Institutions are created to group sources.
observer_prod=> select * from media.institutions limit 3;
id | uuid | created | name | note | url_key | language | is_english
----+--------------------------------------+----------------------------+------+------+------------------------+----------+------------
| e3f346bf-5fd7-43a1-9fe2-db15be1d1081 | 2024-09-11 16:36:45.250154 | | | tempo.com.ph | | f
| 75f04c5b-70c4-4d0c-8d10-4c5b3ac229ff | 2024-09-11 16:36:45.251022 | | | www.threepanelsoul.com | | f
| dae2029a-7b8f-4a5d-bc98-2fd757385834 | 2024-09-11 16:36:45.2519 | | | rss.gazetaprawna.pl | | f
(3 rows)
Sources hold data about each url within the system.
observer_prod=> select * from media.sources limit 3;
id | uuid | created | name | note | source_url | fetch_method | handle_method | last_fetched | active | id_institution | language | is_english
-----+--------------------------------------+----------------------------+------------------------+------+-------------------------------------------------------+--------------+---------------+----------------------------+--------+----------------+----------+------------
| a55a1465-2003-4c50-9b64-00084e790977 | 2024-09-11 16:37:00.143056 | BBC News | | http://feeds.bbci.co.uk/news/world/asia/india/rss.xml | | | 2025-01-14 03:38:12.955364 | t | | | f
| 304936aa-fd58-49f9-a965-59cd463feac5 | 2024-09-11 16:37:05.365737 | CBC | Top Stories News | | https://www.cbc.ca/cmlink/rss-topstories | | | 2025-01-01 17:01:53.40127 | f | | | f
| 6892980c-04d9-4dab-9e8b-d167f9302fc6 | 2024-09-11 16:37:01.085708 | BBC News | | http://feeds.bbci.co.uk/news/world/rss.xml | | | 2025-01-14 03:38:13.213961 | t | | | f
(3 rows)
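The `url_key` values in `media.institutions` look like the host part of the `source_url` values in `media.sources`. The source does not say how the two are linked, so the following is only a sketch under that assumption. The helper names `url-key` and `group-by-url-key` are hypothetical and not part of Observer.

```clojure
(import '(java.net URI))

(defn url-key
  "Hypothetical helper: returns the host part of a URL string,
  or nil when the string cannot be parsed or has no host."
  [^String source-url]
  (try
    (.getHost (URI. source-url))
    (catch Exception _ nil)))

(defn group-by-url-key
  "Groups source URLs by their host, assuming (not confirmed by the
  source) that an institution corresponds to one host."
  [source-urls]
  (group-by url-key source-urls))

;; The two BBC feeds share a host, so they land in the same group.
(group-by-url-key
  ["http://feeds.bbci.co.uk/news/world/asia/india/rss.xml"
   "http://feeds.bbci.co.uk/news/world/rss.xml"
   "https://www.cbc.ca/cmlink/rss-topstories"])
```

Under this assumption the two BBC feeds would belong to one institution keyed by `feeds.bbci.co.uk`, and the CBC feed to another keyed by `www.cbc.ca`.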
Items hold data about each article.
observer_prod=> select * from media.items limit 3;
id | uuid | created | title | description | text_content | posted | article_link | image_link | video_link | id_source | uri | id_institution | is_english | language
-----------+--------------------------------------+----------------------------+--------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+---------------------+----------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------+------------+-----------+----------------------------------+----------------+------------+----------
| df313f63-babc-41d8-b8a1-d61a300a946b | 2024-12-21 07:35:05.128839 | Поиски пропавшего на Камчатке самолета Ан-2 приостановили из-за темноты | На Камчатке группировка спасателей, выдвинувшаяся на поиски пропавшего самолета Ан-2, остановилась в связи с наступлением темноты. Об этом сообщили в ГУ МЧС России по Камчатскому краю. | | 2024-12-21 07:30:00 | https://rg.ru/2024/12/21/reg-dfo/poiski-propavshego-na-kamchatke-samoleta-an-2-priostanovili-iz-za-nastupleniia-temnoty.html | | | | 50612de9ea4b8aa41d6057fcb46a8812 | | f | ru
| ee422536-2df7-4102-94ab-5670f1400e52 | 2024-11-22 09:03:18.851295 | Ile mogą dorobić niektórzy emeryci i renciści? Od 1 grudnia 2024 r. nowe progi | Znamy już tzw. limity dorabiania do świadczeń emerytalno-rentowych, które będą obowiązywały od 1 grudnia 2024 r. Warto jednak pamiętać, iż progi te nie dotyczą wszystkich emerytów i rencistów. Kto nie musi martwić się limitami? | | 2024-11-22 08:32:09 | https://serwisy.gazetaprawna.pl/emerytury-i-renty/artykuly/9674059,ile-moga-dorobic-niektorzy-emeryci-i-rencisci-od-1-grudnia-2024-r-no.html | https://ocdn.eu/pulscms-transforms/1/X6HktkuTURBXy8xYjkwMGEyYS03YzMyLTQyNDctOTY0NC1lMGJhMjQ1NTczZDkuanBlZ5GTBc0BHcyg | | | ad4f048db7b45433d958f9aa75ef4df7 | | f | pl
| 0e4ced1a-2fdd-4e29-a4f6-41b01b10a6b6 | 2024-11-22 09:03:21.35581 | Milan, occhio a Belahyane. Il CorSport in apertura: "Inter sul nuovo Calha" | Negli scorsi giorni alla mediana del Milan è stato accostato il nome di Reda Belahyane (QUI il focus di MilanNews. | | 2024-11-22 08:50:00 | https://www.milannews.it/rassegna-stampa/milan-occhio-a-belahyane-il-corsport-in-apertura-inter-sul-nuovo-calha-557126 | | | | 99fb4aaaa26135a79265dac234be21f3 | | f | it
(3 rows)
observer_prod=> select count(*) from media.items;
count
---------
1961729
(1 row)
Analysis tables store compressed analytics derived from the content fetched from each article URL.
This system is built on top of the Luminus micro-framework.
The home-routes handler in the observer.routes.home namespace defines the route that invokes the home-page function whenever an HTTP request is made to the / URI using the GET method.
(defn home-routes []
  [""
   {:middleware [middleware/wrap-csrf
                 middleware/wrap-formats]}
   ["/" {:get home-page}]
   ["/about" {:get about-page}]])
The home-page function will in turn call the observer.layout/render function to render the HTML content:
(defn home-page [request]
  (layout/render
    request
    "home.html" {:docs (-> "docs/docs.md" io/resource slurp)}))
The render function will render the home.html template found in the resources/html folder using a parameter map containing the :docs key. This key points to the contents of the resources/docs/docs.md file containing these instructions.
The HTML templates are written using the Selmer templating engine.
<div class="content">
{{docs|markdown}}
</div>
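As a rough illustration of how a template like this turns the :docs value into HTML, the sketch below registers a `markdown` filter with Selmer and renders an inline copy of the template. It assumes the selmer and markdown-clj libraries are on the classpath. The `template` string and the `render-docs` helper are hypothetical stand-ins for the real home.html and observer.layout/render.

```clojure
(require '[selmer.parser :as parser]
         '[selmer.filters :as filters]
         '[markdown.core :refer [md-to-html-string]])

;; Register a `markdown` filter. Returning [:safe html] tells Selmer
;; not to HTML-escape the markup produced by the Markdown converter.
(filters/add-filter! :markdown
  (fn [content] [:safe (md-to-html-string content)]))

;; Inline stand-in for the home.html template (hypothetical).
(def template
  "<div class=\"content\">\n{{docs|markdown}}\n</div>")

(defn render-docs
  "Hypothetical helper: renders the template with the given Markdown text
  bound to the :docs key."
  [docs]
  (parser/render template {:docs docs}))

(render-docs "# Observer")
```

The result is the `div` wrapper with the Markdown heading converted to an HTML heading inside it.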
The routes are aggregated and wrapped with middleware in the observer.handler namespace:
(mount/defstate app-routes
  :start
  (ring/ring-handler
    (ring/router
      [(home-routes)])
    (ring/routes
      (ring/create-resource-handler
        {:path "/"})
      (wrap-content-type
        (wrap-webjars (constantly nil)))
      (ring/create-default-handler
        {:not-found
         (constantly (error-page {:status 404, :title "404 - Page not found"}))
         :method-not-allowed
         (constantly (error-page {:status 405, :title "405 - Not allowed"}))
         :not-acceptable
         (constantly (error-page {:status 406, :title "406 - Not acceptable"}))}))))
The app-routes definition groups all the routes in the application into a single handler. A default handler is added to cover the 404, 405, and 406 cases.
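To see how the default handler picks between these cases, here is a minimal sketch using reitit.ring with a single route. It assumes reitit is on the classpath. `demo-app`, `status-of`, and the plain-map responses are hypothetical simplifications that stand in for the real app-routes and error-page.

```clojure
(require '[reitit.ring :as ring])

(defn home-page [_request]
  {:status 200, :headers {"Content-Type" "text/plain"}, :body "home"})

;; A one-route handler with explicit fallbacks for unmatched requests.
(def demo-app
  (ring/ring-handler
    (ring/router [["/" {:get home-page}]])
    (ring/create-default-handler
      {:not-found          (constantly {:status 404, :body "404 - Page not found"})
       :method-not-allowed (constantly {:status 405, :body "405 - Not allowed"})})))

(defn status-of
  "Hypothetical helper: sends a bare request map through demo-app
  and returns the response status."
  [method uri]
  (:status (demo-app {:request-method method, :uri uri})))

(status-of :get "/")        ; route and method both match
(status-of :get "/missing") ; no route matches, so :not-found runs
(status-of :post "/")       ; route matches but method does not
```

Calling the handler with plain request maps like this is also a convenient way to exercise routes from the REPL without starting a server.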
The home-routes are wrapped with two middleware functions. The first enables CSRF protection. The second takes care of serializing and deserializing various encoding formats, such as JSON.
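For readers new to Ring, a middleware function is simply a function that takes a handler and returns a new handler. The sketch below uses two hypothetical middleware functions (not the real wrap-csrf or wrap-formats) to show how one can modify the request on the way in and another the response on the way out.

```clojure
(defn wrap-request-flag
  "Hypothetical middleware: marks the request before the handler sees it."
  [handler]
  (fn [request]
    (handler (assoc request :flagged? true))))

(defn wrap-response-header
  "Hypothetical middleware: adds header k with value v to every response."
  [handler k v]
  (fn [request]
    (assoc-in (handler request) [:headers k] v)))

(defn echo-handler
  "Reports whether the request was flagged by upstream middleware."
  [request]
  {:status 200, :headers {}, :body (str "flagged? " (:flagged? request))})

;; Compose the middleware around the handler. The last wrapper applied
;; is the outermost one, so it sees the request first and the response last.
(def wrapped
  (-> echo-handler
      wrap-request-flag
      (wrap-response-header "X-Demo" "1")))

(wrapped {:request-method :get, :uri "/"})
;; => {:status 200, :headers {"X-Demo" "1"}, :body "flagged? true"}
```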
Request middleware functions are located under the observer.middleware namespace.
This namespace is reserved for any custom middleware for the application. Some default middleware is already defined here. The middleware is assembled in the wrap-base function.
Middleware used for development is placed in the observer.dev-middleware namespace found in the env/dev/clj/ source path.
Visit the official documentation for examples on how to accomplish common tasks with Luminus. The #luminus channel on the Clojurians Slack and Google Group are both great places to seek help and discuss projects with other users.