Sunday, June 5, 2016

CubesViewer 2.0.1 released!

I'm very proud to release CubesViewer 2.0.1, a major revision of my database visual analytics application.

This release features many improvements, new features, a rebranded look and feel, and a new code architecture that greatly eases development and paves the way for future versions.

CubesViewer has undergone a major upgrade. The code is now built upon AngularJS, and the UI framework has been migrated from jQueryUI to Bootstrap and Angular Bootstrap components. The HTML has been rewritten and separated into easier-to-handle templates.

The application is now more responsive and mobile friendly and looks more stylish overall. CSS has been reworked and namespaced, easing integration into other web documents.

The migration to AngularJS has involved a comprehensive refactoring and review of every module, and we trust it's been for the better. Internally, the build pipeline now uses Less, Grunt and Bower, and a lot of dependencies have been removed. Together, these changes allow CubesViewer to be distributed as a single .js file (a minified version is also available) and an accompanying .css file. JSDoc has also been introduced.

Other additions include:
  • Printer friendly CSS.
  • Export charts as images.
  • New horizontal bars chart.
  • Line and area charts with curved lines.
  • Improved error reporting and user interface.
  • CubesViewer Server (optional) upgraded to Django 1.9.
  • Plugin for cube usage tracking via Google Analytics.
  • Improved documentation and tutorials.
I hope to publish an open data site using CubesViewer soon. In the meantime, it's open source! Download it, use it, share it and contribute :).

Tuesday, July 8, 2014

My Sound Notes (Apuntes de Sonido)

I have shared my Sound Notes (Apuntes de Sonido), and they seem to have been well received...

Saturday, November 16, 2013

I have created a website with a viewer for analyzing data from the Spanish General State Budget (Presupuestos Generales del Estado).

I had wanted to find this information and build something like this for a long time. The truth is that it is extremely hard to find data in machine-readable formats. Most published data comes as PDFs or web pages, which are impossible to process. It turns out that there are foundations such as OKFN that work to keep public data easily accessible, and it turns out (surprise!) that Spain ranks rather poorly in openness and accessibility of public data. The most interesting part, spending, is still hard to track down. It would be great to be able to easily see who is awarded which contracts, as is already possible in other countries such as Slovakia.
Per capita spending in each autonomous community, 2006-2012. How come Navarra has such a large bar? Does a Navarrese cost us twice as much as a Murcian?!

I eventually found that the people at Civio had already done something very similar by processing the budget PDFs for ¿Dónde van mis impuestos?, and that they had published their data. Thanks to them I was able to put this website together. I wonder... what could we see by analyzing the BOE and the various regional and provincial gazettes?

The site still has plenty of room for improvement, but it's a start :D. The tool used for the data analysis is a project called CubesViewer, written by yours truly. I hope someone finds it interesting.

Thanks to Mateo and Pablo, who have supported the idea and are contributing the hosting and the domain.

Sunday, February 10, 2013

Hooverphonic (Cover) - Mad about you

We have recorded a short version of "Mad about you" by Hooverphonic. This is the second track I have produced:

Vocals: Xiana Teimoy
Instruments and production: J Montes

The audio can be downloaded, and the other tracks browsed, from my SoundCloud profile.

While I'm at it, here is a reminder of the previous track I published: J - Nowadays.

Saturday, January 19, 2013

Wikipedia Facts

What can we find if we download and generate some statistics about Wikipedia?

1) Overview: As of January 2nd, 2013, Wikipedia has 13 057 082 entries (the Encyclopaedia Britannica has 228 274 entries, according to Wikipedia itself). There are almost as many redirections as actual articles:

Wikipedia Articles (blue: articles, green: redirections)
2) Articles: Let's look only at the contents of real articles (no redirections). This is more than 7.1 million articles:

The average article belongs to 2.6 categories and contains 2.31 links to pages outside the site and 33.86 links internal to Wikipedia. Thousands of articles feature thousands of links. In total, there are 16 413 888 external links and 240 751 315 internal links. Enough to get lost for a while!

The average size is 4 634 characters (roughly 110 words per article), but the total size of the article text is 32 GB. And this is raw text only: images are not included.

3) Content

This is one of the most striking results of all. I searched for certain words within the text of the articles and assigned each a score. The following diagram shows how many articles are defined by a particular word (the word with the most occurrences). This has been shown before, but the results still seem astonishing to me:

Perhaps we should start thinking of how airily we use the term "war".
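
The scoring idea described above can be sketched roughly as follows. This is only an illustration, not the script used for this post: the candidate word list is hypothetical (the post does not say which words were searched), and the score is assumed to be a plain count of whole-word occurrences.

```python
import re
from collections import Counter

# Hypothetical word list; the words actually searched are not listed in the post.
CANDIDATE_WORDS = ("war", "peace", "music", "science")


def dominant_word(text, candidates=CANDIDATE_WORDS):
    """Return the candidate with the most whole-word occurrences, or None.

    Ties are resolved in favour of the word listed first in `candidates`.
    """
    tokens = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {word: tokens[word] for word in candidates}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def articles_per_word(texts, candidates=CANDIDATE_WORDS):
    """Count how many articles are 'defined' by each candidate word."""
    tally = Counter()
    for text in texts:
        word = dominant_word(text, candidates)
        if word is not None:
            tally[word] += 1
    return tally


# Made-up sample texts standing in for article bodies.
sample_texts = [
    "The war ended the war; peace followed.",
    "Music and science, but mostly music.",
    "Nothing relevant here.",
]
sample_tally = articles_per_word(sample_texts)
print(sample_tally)
```

Articles that contain none of the candidate words are simply left out of the tally.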

4) Geography

I can only report on the articles last updated by anonymous users. For what it's worth, this is how real article updates (by anonymous users) were distributed among the different continents. This population includes 490 080 articles:

5) Updates: This is a result I am pretty surprised by. The following graph shows the year and quarter in which articles were last updated (separating redirections, in yellow, from articles, in blue). Apparently, a huge percentage of articles were updated during the last quarter of 2012, which could mean that Wikipedia is very lively and is being updated frequently. This value seems too high to me, though, so I wonder whether some automatic process is updating Wikipedia articles.
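
As a rough sketch of how such a chart can be derived, the snippet below groups last-update timestamps by year and quarter. It assumes timestamps in the ISO 8601 form that MediaWiki exports use (for example `2012-11-03T10:15:00Z`); the function names and sample values are made up for illustration.

```python
from collections import Counter
from datetime import datetime


def quarter_label(timestamp):
    """Map a timestamp such as '2012-11-03T10:15:00Z' to a label like '2012-Q4'."""
    moment = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    quarter = (moment.month - 1) // 3 + 1
    return f"{moment.year}-Q{quarter}"


def updates_per_quarter(timestamps):
    """Count how many pages were last updated in each year and quarter."""
    return Counter(quarter_label(ts) for ts in timestamps)


# Made-up timestamps, only to exercise the functions.
sample_timestamps = [
    "2012-11-03T10:15:00Z",
    "2012-12-31T23:59:59Z",
    "2011-02-14T08:00:00Z",
]
sample_quarters = updates_per_quarter(sample_timestamps)
for label in sorted(sample_quarters):
    print(label, sample_quarters[label])
```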

Unfortunately, I can't get the "creation date" of articles as the normal Wikipedia dump doesn't include that information.

6) Titles

The average title entry is 26.9 characters long. There are entries starting with every character you can think of: , Ɣ, ¢, £, § ... We can also see how entries are distributed across letters. Surprisingly, the digits 1 and 2 have more entries than the letters Q, X, Y or Z, and even more than U and V.
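
A minimal sketch of how these title statistics can be computed is shown below. The sample titles are made up, and grouping by upper-cased first character is an assumption about how the distribution was built.

```python
from collections import Counter


def title_stats(titles):
    """Return (average title length, Counter of upper-cased first characters)."""
    titles = [t for t in titles if t]  # ignore empty titles
    if not titles:
        return 0.0, Counter()
    first_chars = Counter(t[0].upper() for t in titles)
    average_length = sum(len(t) for t in titles) / len(titles)
    return average_length, first_chars


# Made-up titles, only to exercise the function.
sample_titles = ["1 (number)", "2013", "Quark", "war", "Wikipedia", "£sd"]
average_length, first_chars = title_stats(sample_titles)
for char, count in first_chars.most_common():
    print(char, count)
```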

7) Processing

I have done the analysis and charts with CubesViewer OLAP Data Explorer, an open source data explorer that I published a couple of weeks ago.

Processing the Wikipedia export file (a 39.6 GB XML file) took my computer more than 18 hours, although it used only one core and I didn't spend any time optimizing it.

source: 8,95GB 18:35:54 [ 140kB/s]
bzcat: 39,6GB 18:35:57 [ 621kB/s] 
articles: 13,1M 18:35:57 [ 195/s]    
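
For readers who want to try something similar, here is a minimal Python sketch of streaming through a bzip2-compressed export without loading it into memory. It is not the pipeline used for this post (the log above comes from piping `bzcat` output); it assumes the MediaWiki export layout, in which each `<page>` element carries a `<redirect>` child when the page is a redirection, and the dump path passed to `count_pages_in_dump` is hypothetical.

```python
import bz2
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def count_pages(stream):
    """Count articles and redirections in a MediaWiki XML export stream."""
    counts = {"articles": 0, "redirects": 0}
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        is_redirect = any(local_name(child.tag) == "redirect" for child in elem)
        counts["redirects" if is_redirect else "articles"] += 1
        elem.clear()  # drop the finished page's text and children to keep memory low
    return counts


def count_pages_in_dump(path):
    """Open a .xml.bz2 dump (hypothetical path) and count its pages."""
    with bz2.open(path, "rb") as stream:
        return count_pages(stream)


# Tiny compressed sample standing in for a real dump.
SAMPLE = b"""<mediawiki>
  <page><title>War</title><revision><text>...</text></revision></page>
  <page><title>WAR</title><redirect title="War" /></page>
</mediawiki>"""

with bz2.open(io.BytesIO(bz2.compress(SAMPLE)), "rb") as sample_stream:
    sample_counts = count_pages(sample_stream)
print(sample_counts)
```

Decompression and parsing both run in a single process here, so this sketch would also be limited to one core.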

Sunday, January 13, 2013

New open source application: CubesViewer

I have recently been working on a data exploration and visualization tool, and I am very happy to announce the public release of this new project as open source.

It is called CubesViewer, and it is an Online Analytical Processing (OLAP) exploration tool. In everyday words, it allows people to design and produce reports and charts about many kinds of data that can be extracted from a database (such as contracts, invoices, climate, demography, scientific production, Wikipedia articles, public spending, logistics...).

I wanted to use a simple OLAP server, and I found the Cubes project, a fantastic lightweight OLAP server that includes everything I needed.

It's always nice to publish Open Source software. I hope this is of use to people.

  • User Interface allowing for multiple views on-screen.
  • Cube explorer providing drilldown and cut operations.
  • Supports dimension hierarchies and date filtering.
  • Different types of charts and diagrams.
  • View management, sharing and saving.
  • Modular and extensible.
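
For readers new to OLAP, the "cut" and "drilldown" operations mentioned in the list above can be illustrated with a tiny in-memory example. This is a conceptual sketch only: it does not use the Cubes or CubesViewer APIs, and the fact table, dimension names and figures are all made up.

```python
from collections import defaultdict

# Hypothetical fact table: one row per spending record (made-up figures).
FACTS = [
    {"year": 2011, "region": "North", "amount": 120.0},
    {"year": 2011, "region": "South", "amount": 60.0},
    {"year": 2012, "region": "North", "amount": 110.0},
    {"year": 2012, "region": "South", "amount": 55.0},
]


def cut(facts, **conditions):
    """Keep only the facts whose dimensions match every given value."""
    return [f for f in facts if all(f[d] == v for d, v in conditions.items())]


def drilldown(facts, dimension, measure="amount"):
    """Aggregate a measure, broken down by the values of one dimension."""
    totals = defaultdict(float)
    for fact in facts:
        totals[fact[dimension]] += fact[measure]
    return dict(totals)


by_year = drilldown(FACTS, "year")
by_region_2012 = drilldown(cut(FACTS, year=2012), "region")
print(by_year)
print(by_region_2012)
```

Here `cut` narrows the data to a slice (the year 2012) and `drilldown` then breaks the total down by another dimension (region), which is roughly what happens when clicking through a cube in an explorer.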

CubesViewer Project:

Sunday, November 11, 2012

"Information" about YouTube "censorship"

Today YouTube wrote to me. They say that one of my videos may contain "inappropriate" content. In fact, it has been blocked in Germany.

It seems that someone or something (more likely some program, I suppose) has detected a commercial song playing in the background.

I don't understand how this can be considered an offence :(. It is completely out of place, as anyone can judge for themselves by watching the video: "Levitron".

Thank goodness that, as YouTube tells me, this does not penalize my account. Any day now I could upload a video and, for God knows what reason, lose my YouTube account, and why not, my Google account too, and while we're at it, access to my own phone. It sounds impossible, but Amazon has already left some people without access to their books...

Here is the text of YouTube's email (the copyright information page gives a few more details):
    Your video "Levitron", may have content that is owned or licensed by UMG, but it’s still available on YouTube! In some cases, it may be blocked, or ads may appear next to it.

    This claim is not penalizing your account status. Visit your Copyright Notice page for more details on the policy applied to your video.