Ticket #487 (closed: fixed)

Opened 9 years ago

Last modified 3 years ago

Client-side graphs drawing

Reported by: lukacu
Owned by: mitar
Priority: minor
Milestone: 3.0b
Component: nodewatcher/modules
Version:
Keywords: projectideas, biggerproject
Cc:
Related nodes:
Realization state:
Blocking: 288, 437, 534, 764, 1008, 1124, 1125
Effort: high
Blocked by: 799
Security sensitive: no

Description (last modified by mitar)

In our node database system nodewatcher we extensively use graphs to display data about the network and nodes: how they behave, how they are used, and so on. Currently we use a server-side library (RRDtool) to render graphs in advance. This has two drawbacks: rendering many graphs is CPU-hungry on the server, and the graphs are not interactive, so users cannot manipulate them. Client-side graph drawing (with the HTML5 canvas tag or similar) therefore seems a good solution.

An example of such graphs.

In addition to client-side drawing, server-side support has to be implemented. Because it is not possible to simply send all data points to clients, data has to be aggregated for different levels of "zoom". Based on this aggregation an API has to be defined (a datastream API, or whatever we end up calling it), which the drawing library will use to get data points.
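
To make the aggregation idea concrete, here is a minimal Python sketch. It only illustrates bucketing by zoom level and is not the API that was eventually implemented; the names Bucket, aggregate and ZOOM_STEPS, and the step sizes, are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class Bucket:
    start: int       # start of the bucket, in seconds
    minimum: float
    maximum: float
    mean: float
    count: int       # number of raw points that fell into the bucket


def aggregate(points: Iterable[Tuple[int, float]], step: int) -> List[Bucket]:
    """Group (timestamp, value) points into fixed-width buckets of `step` seconds."""
    grouped: Dict[int, List[float]] = {}
    for timestamp, value in points:
        start = timestamp - (timestamp % step)
        grouped.setdefault(start, []).append(value)
    return [
        Bucket(start, min(values), max(values), sum(values) / len(values), len(values))
        for start, values in sorted(grouped.items())
    ]


# Hypothetical zoom levels: the further you zoom out, the wider the bucket.
ZOOM_STEPS = {"raw": 300, "hour": 3600, "day": 86400}

samples = [(0, 1.0), (300, 3.0), (600, 2.0), (3600, 10.0)]
hourly = aggregate(samples, ZOOM_STEPS["hour"])
```

With something like this, a client that is zoomed out asks for the "hour" or "day" step and receives one minimum/maximum/mean triple per bucket instead of every raw point.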

Change History

comment:4 Changed 8 years ago by mitar

  • Milestone changed from Ideje za naprej to 3.0b
Last edited 8 years ago by mitar

comment:5 Changed 8 years ago by mitar

  • Milestone changed from 3.0b to Next milestone

comment:7 Changed 8 years ago by mitar

  • Blocked by set to 799

comment:8 Changed 8 years ago by mitar

  • Blocked by 799 deleted

comment:9 Changed 8 years ago by mitar

  • Keywords projectideas, biggerproject, gsoc added
  • Description modified (diff)
  • Blocked by set to 799
  • Summary changed from Vizualizacija grafov z HTML5 canvas tagom (Graph visualization with the HTML5 canvas tag) to Client-side graphs drawing

comment:10 Changed 8 years ago by Floh1111

Hi,

I did client-side visualization for Netmon with javascriptrrd and flot. The implementation is quick and dirty and static at the moment, but if you are looking for ideas, take a look at http://netmon.freifunk-ol.de/networkstatistic.php for an example, and feel free to contact me at clemens[at]freifunk-ol.de

comment:13 Changed 7 years ago by mitar

Related: #922.

comment:14 Changed 7 years ago by mitar

  • Effort changed from normal to high

comment:15 Changed 7 years ago by mitar

  • Description modified (diff)

comment:16 Changed 7 years ago by mitar

Related: #437, #490, #534.

comment:18 Changed 7 years ago by mitar

Some good features:

  • (time)span selection on the graph
    • zoom into it
    • export selected aggregated datapoints
    • export raw data behind selected span
  • save displayed image (screenshot) (no idea how to do that)
  • link to the graph (permalink through anchor/hash options)

comment:23 Changed 7 years ago by lukacu

While working on graph drawing for node data visualization it would make sense to use the same library to visualize network topology there as well. Something like this or this could be a starting point.

comment:24 Changed 7 years ago by mitar

We have #736 for tracking development of topology drawing. Maybe you could put links also there?

Furthermore, Kostko participated in a project developing a JavaScript graph layout engine. ;-)

https://code.google.com/p/foograph/

comment:25 Changed 7 years ago by lukacu

What I wanted to point out is that it would be wise to select a library that can do both things, so that we can reuse it for topology as well. And perhaps write a JSON topology export while writing the node graph JSON API.

comment:26 Changed 7 years ago by mitar

Yes. We were thinking of something like this. To have a general drawing library, which could draw 1D or 2D plots, time-series, scatterplots and graphs.

This would also be interesting on the data-feeding side, because we should have data aggregation (if you "zoom out").

Furthermore, 1D and 2D data, and also graphs, often have a "time" component. So a time series can be seen as one way of drawing 1D data with a "time" component. We could also have n-D data with a "time" component, as when we currently draw multiple measurements/values on the same time-series plot.

But for scatterplots and graphs the "time" component could be a slider which moves you into the past. This could be interesting to animate, too. For example, to animate how the network topology grows through time, by moving this "time" component.

So graph drawing should be dynamic, it should know how to add and remove nodes and just minimally change the layout, so that changes between time snapshots are gradual.

comment:27 Changed 7 years ago by mstajdohar

  • Status changed from new to assigned
  • Owner changed from lukacu to mstajdohar

comment:28 Changed 7 years ago by kostko

What we first need to define is an API that will be used with the data archival system. As far as I see it, there are two basic operations:

  • insert for adding new data points to the time series. This will be called by monitoring processors after performing measurements with the node and the data instance (which is a registry item in the monitoring schema).
  • query for performing different kinds of queries. This part is less specified, so we should first document what kind of queries will be performed on the data archive.

After defining these two operations, an abstract DataArchiveBackend can be added to nodewatcher with a default "null" implementation (that does nothing and returns no results to all queries). Then we can decide on what will be the default backend (SQL, MongoDB, Cube, OpenTSDB, ...) and what additional stuff needs to be done there (like aggregations).
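
A rough Python sketch of what such an abstract backend could look like. Only the name DataArchiveBackend and the insert/query split come from the text above; the method signatures, the stream_id parameter and the NullBackend class name are assumptions for illustration, since the query operation is not specified yet.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Tuple


class DataArchiveBackend(ABC):
    """Abstract interface between monitoring processors and the data archive."""

    @abstractmethod
    def insert(self, stream_id: str, timestamp: int, value: Any) -> None:
        """Add one data point to the time series identified by `stream_id`."""

    @abstractmethod
    def query(self, stream_id: str, start: int, end: int) -> List[Tuple[int, Any]]:
        """Return (timestamp, value) pairs between `start` and `end`."""


class NullBackend(DataArchiveBackend):
    """Default "null" implementation: stores nothing and returns no results."""

    def insert(self, stream_id: str, timestamp: int, value: Any) -> None:
        pass  # data is silently dropped

    def query(self, stream_id: str, start: int, end: int) -> List[Tuple[int, Any]]:
        return []


backend = NullBackend()
backend.insert("node-1/cpu", 0, 0.5)       # hypothetical stream identifier
result = backend.query("node-1/cpu", 0, 600)
```

A real backend (SQL, MongoDB, OpenTSDB, ...) would subclass the same interface, so the rest of nodewatcher would not need to know which one is configured.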

Last edited 7 years ago by kostko

comment:29 Changed 7 years ago by mitar

OK, some list of visualizations we currently use:

Because we will now have interactive graphs, it would be useful to be able to turn on/off which values you want displayed on the graph, and also to add values from other graphs. Probably the best would be to have some graphs predefined when the user opens the page, while still letting the user create their own graph based on all values provided by the node.

We would also like to be able to:

  • display node events: #534
  • data aggregation should be done smart: #437 (upstream ticket)
  • export data (whole or selection)
  • zoom in/out (both in time and amplitude), zoom in to selection, zoom out to whole, pan through time and amplitude
  • different scales: linear, logarithmic

So with aggregation we could be smart. For example, we could start with only one value when you look at maximum zoom level. But once you move out, values could be displayed as an area: minimum, maximum and average value. Aggregation for states should also be done better than it is currently (see above).

Be careful with these issues we currently have: #478, #547

Also make sure that the settings of what is displayed are in the URL (anchor), so that you can copy/paste the URL and somebody else gets the same page content.
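
As a sketch of how display settings could round-trip through the URL anchor: the real code would live in client-side JavaScript, so the Python below only illustrates the encoding. The state keys (streams, start, end, scale) and the example.org URL are made up for the example.

```python
from typing import Any, Dict, List
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def with_view_state(url: str, state: Dict[str, Any]) -> str:
    """Return `url` with `state` encoded into its fragment (the part after '#')."""
    parts = urlsplit(url)
    # Sorting the keys keeps the permalink stable for the same settings.
    fragment = urlencode(sorted(state.items()), doseq=True)
    return urlunsplit(parts._replace(fragment=fragment))


def read_view_state(url: str) -> Dict[str, List[str]]:
    """Decode the settings stored in the fragment; every value comes back as a list of strings."""
    return parse_qs(urlsplit(url).fragment)


state = {"streams": ["cpu", "memory"], "start": 0, "end": 3600, "scale": "linear"}
permalink = with_view_state("https://example.org/node/1", state)
restored = read_view_state(permalink)
```

Because the fragment is never sent to the server, changing what is displayed does not have to trigger a page reload.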

This is for current 1D graphs. For scatter plots and other things similar ideas should apply. Some were already mentioned above.

comment:30 Changed 6 years ago by mitar

Have you looked at Cubism.js closely? It has some really good points:

  • It does not require page reload to get new data, but data is polled automatically. I would really like to see this too. (But our poll rate will be much smaller, 5 minutes probably for most graphs, maybe 1 minute for graphs where we can locally get data.)
  • It uses horizon charts to display more information in limited vertical space. This is a very interesting approach, but probably not the best as a general display, because users who are not used to this way of plotting values will not understand it. So I would not use it for default graphs. But I would use something like this for a mode where you want to compare values (as I wrote before, some graphs should be displayed by default, and you should also be able to manually select what you want to see/watch, maybe on some subpage).
  • Using Color Brewer for colors.

comment:31 Changed 6 years ago by mstajdohar

This is exactly what I use in the example here: http://seritest.ainda.si/test/. The poll interval is currently fixed at 5 minutes, but it actually depends on the data step (zoom). In the interactive version the user can zoom in and out, which changes the step and also the update interval.

The colors are now set manually in the script, but I suppose I could use Color Brewer as well.

Last edited 6 years ago by mitar

comment:32 Changed 6 years ago by mitar

Hm, good question: do we really want to use MongoDB for the backend, or rather OpenTSDB?

comment:33 Changed 6 years ago by mitar

Miha, the link above did not work. I fixed it. ;-) Is it correct now?

comment:34 Changed 6 years ago by mstajdohar

The link is OK. Today I made my first interactive network in D3, which we will eventually use to visualize the router network. Using the D3 library is completely different from using Cubism, so I suggest this should be a separate project, moved to a new ticket. I think a general library for such diverse visualizations is not a practical idea. I suggest we discuss where to put the new code at lunch next week.

comment:35 Changed 6 years ago by mitar

There is a completely separate ticket for that (#736), and moreover it was also a completely separate GSoC idea. You said you would do plots, not topology. But of course you can do both. ;-)

I think we need the same API for data storage and streaming, but the JavaScript part should simply use whatever code makes the best visualization. So the API for getting data is the same: it should be data-agnostic and simply return (possibly aggregated) samples in time, whatever those are. JavaScript then displays them.

comment:36 Changed 6 years ago by mitar

Furthermore, as I explained above, I don't believe that we should use horizon charts as the default graph display. Only when you want to compare different values at the same time.

comment:37 Changed 6 years ago by mitar

Also, the API should not be based on polling. This does not scale. Also, different data sources provide data at different intervals; it is only currently that we collect everything at fixed 5-minute intervals (and even those are not exact). Not to mention that if we want to use the same API in other projects, we simply cannot do polling. So don't do it like that. Use django-pushserver.

comment:38 Changed 6 years ago by mitar

Interesting, a time-series cloud provider. They have a Python library.

We could get some interesting ideas from its API, also for our HTTP API protocol. They use a RESTful API for that.

comment:41 Changed 6 years ago by mitar

comment:42 Changed 6 years ago by mitar

I like NVD3.js.

comment:43 Changed 5 years ago by mitar

comment:45 Changed 5 years ago by mitar

I am leaning towards using Highstock. It is not D3-based, but I have used it in another project and it works pretty well. It already has almost all the features I would like to have. It might not be the prettiest by default. The issue is that it is not really extensible. So we should see if it supports:

  • having horizontal lines for events (node reboots), it does have flags though
  • having background be displayed as one color, which changes based on the state (for example we could map node states, up, down and so on, to background color)

comment:47 Changed 5 years ago by mitar

Both of those two features are already available.

comment:48 Changed 5 years ago by kostko

  • Blocking set to 1008

comment:49 Changed 5 years ago by kostko

  • Blocking changed from 1008 to 764, 1008

comment:50 Changed 5 years ago by mitar

Interesting open time-series (financial) data API: Quandl

comment:52 Changed 5 years ago by mitar

So we should define some tag types which would not be reserved in datastream, but which our client-side code in django-datastream would still use as a hint for what to do initially. I imagine that for a node we want to display a few graphs one below the other, like we are doing now. But what would be cool is that what exactly is displayed, and how, is configured automatically based on tags. So we would make a query for a particular node and the tags would tell us, by default:

  • whether a particular stream should be displayed by default or only on demand
  • whether particular streams should be combined together into one graph
  • whether a stream should initially be displayed as a whole (maximum zoom out) or, for example, just for the last week
  • which aggregation values should be displayed when zooming out (only mean, only max, multiple)
  • the type of visualization: line, stack, background, discrete events

We do not store empty datapoints, so if we want empty space drawn instead of the visualization trying to interpolate between points, we have to hint at the expected interval between points, if any. I would like to make this independent from the highest granularity because, as I have already said on other occasions, the highest granularity is for me just an optimization parameter. But of course when setting tags you can set both (hint and highest granularity) to the same value if you see this as suitable. The client side can then check whether there are any holes larger than this hint and add artificial empty values in between.
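
The hole detection described above could look roughly like the sketch below. It is written in Python for illustration only (the real check would run in the client-side code), and mark_gaps and expected_interval are hypothetical names.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, Optional[float]]


def mark_gaps(points: List[Point], expected_interval: int) -> List[Point]:
    """Insert a (timestamp, None) marker wherever two consecutive points
    are further apart than `expected_interval` seconds."""
    result: List[Point] = []
    previous: Optional[int] = None
    for timestamp, value in points:
        if previous is not None and timestamp - previous > expected_interval:
            # One artificial empty value is enough to break the line;
            # the plot then shows empty space instead of interpolating.
            result.append((previous + expected_interval, None))
        result.append((timestamp, value))
        previous = timestamp
    return result


points = [(0, 1.0), (300, 2.0), (1500, 3.0), (1800, 4.0)]
with_gaps = mark_gaps(points, expected_interval=300)
```

Since measurements do not arrive at exact intervals, the hint would probably have to be somewhat larger than the nominal interval, otherwise ordinary jitter would be drawn as a hole.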

comment:53 Changed 5 years ago by mstajdohar

These tags are added to the stream, I presume. So a certain stream is always supposed to be visualised in a particular manner, defined by its tags?

comment:54 Changed 5 years ago by mitar

  • Status changed from assigned to accepted
  • Owner changed from mstajdohar to mitar

No, this is just what are default settings when you add the stream to visualization. So I envision two interfaces:

  • one where you can list all available streams and add them to visualization, they are added with those settings based on tags
  • one where you can say, give me for example all streams linked to this node, and then based on settings in tags it renders them (for example, some streams might have a tag to not even render them by default in such case)

This is what you will get if you open the page for a particular node, for example. But then you will be able to interact with it and change how graphs are displayed.

BTW, check current test version. :-)

Last edited 5 years ago by mitar

comment:56 Changed 5 years ago by mitar

One more thing we should store in tags:

  • suggested order of streams when visualizing more of them one next to the other (maybe we should just provide some weights?)
  • stream description (what we have currently under some graphs)

So the idea is that with this metadata the system can automatically create the graphs we have now: it knows which streams to combine, which ones to display and in which order, and it can put descriptions below them.

comment:57 Changed 5 years ago by kostko

Stream description is language-dependent so it probably should not be just saved into the database as it has to be adjusted when displayed? Or how do you think we should handle this?

comment:58 Changed 5 years ago by mitar

I think translations are then done at another layer. If somebody wants translations, the description stored in the stream can be used as a key into a translation table. So the description can be seen as the string inside a _(...) function call.
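
A tiny Python sketch of that layering, assuming a plain dictionary as the translation table. In Django this role would normally be played by gettext catalogs; the table contents and the translate helper are made up for the example.

```python
from typing import Dict

# Hypothetical translation table, keyed by the English description stored in the stream.
TRANSLATIONS: Dict[str, Dict[str, str]] = {
    "sl": {"Number of connected clients": "Število povezanih odjemalcev"},
}


def translate(description: str, language: str) -> str:
    """Look up the stored English description; fall back to it when no translation exists."""
    return TRANSLATIONS.get(language, {}).get(description, description)


translated = translate("Number of connected clients", "sl")
untranslated = translate("Uptime", "sl")  # no entry, so the English text is returned
```

The stream itself only ever stores the English string, so the database content does not depend on the display language.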

Last edited 5 years ago by mitar

comment:59 follow-up: ↓ 60 Changed 5 years ago by kostko

What should the naming convention for these visualization-related tags be? Should there be a specific prefix or should they simply be unprefixed? How should they be named? I have currently parsed the need for the following tags from the above text:

  • hidden (boolean) – should the visualization be hidden by default
  • interval (enum of all|week|day) – default visualization interval
  • aggregations (set of enums of supported downsamplers) – default aggregations when zooming out
  • type (enum of line|stack|...) – default visualization type
  • order (integer) – display order weight
  • description (string) – description of the stream in English language

I am not exactly sure about that "combine multiple streams" hint. Since this affects multiple streams, which stream should have this to ensure the best consistency? Because in my opinion the best thing would be to have a virtual stream that has no data, only tags that describe which other streams should be combined when visualizing. In this way, all metadata about stream combinations is stored in one place.

Or did you have something else in mind?

comment:60 in reply to: ↑ 59 Changed 5 years ago by mitar

Replying to kostko:

What should the naming convention for these visualization-related tags be? Should there be a specific prefix or should they simply be unprefixed? How should they be named?

We support subdocuments as tags, so I would suggest we have a top-level visualization tag which as a value has a dict with all visualization related key/value pairs.

  • hidden (boolean) – should the visualization be hidden by default

OK.

  • interval (enum of all|week|day) – default visualization interval

I was thinking later that this is unnecessary. Because it has to be synced between all graphs anyway. So it would be really strange if one graph would have one week interval, and another a different one. It would be misleading.

  • aggregations (set of enums of supported downsamplers) – default aggregations when zooming out

Set or list? :-) Set stored as list? ;-) I would name it singular, aggregation.

  • type (enum of line|stack|...) – default visualization type

Line, stack, state, point, event. (Point has a meaningful Y value, but is not connected with a line. Event has a meaningful time, but not really a Y value: for example, reboot and warnings/errors streams. The latter can have a list of JSON objects as values.)

Later on we would have graph type as well (for topologies).

  • order (integer) – display order weight

Should it be weight? Or order_weight?

  • description (string) – description of the stream in English language

OK. Description should be a top-level value. Not inside visualization key.

  • scale (enum of linear|logarithmic2|logarithmic10)
  • minimum (float) – known minimum value (for example, if you want that the graph is anchored at 0, even if values at a given interval start higher)
  • maximum (float) – known maximum value (for example, for LQ and ILQ we could set both min and max)

I am not exactly sure about that "combine multiple streams" hint. Since this affects multiple streams, which stream should have this to ensure the best consistency? Because in my opinion the best thing would be to have a virtual stream that has no data, only tags that describe which other streams should be combined when visualizing. In this way, all metadata about stream combinations is stored in one place.

Or did you have something else in mind?

I am not sure if this can be solved by virtual streams. The idea is that the user can add and remove streams as wanted, and if he or she adds two streams whose metadata says that they should be displayed together, they snap together and are visualized as one. But the user should be able to add only one of them. With the virtual stream idea you can add a virtual stream, but if you add both linked streams, nothing happens. So I am not so sure about the virtual stream idea.

I would not complicate here with consistency. I would simply have a key

  • with (list of query tags) – list of query tags for which this stream is visualized together, if they are being visualized as well

For example, this can be simply list of ids, if you want to fix on the id (is id a tag as well? it should be). Or some query which says ID of the node and for example all network interface streams. So any network interface stream added to the visualization would be rendered together, for example.
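
To illustrate how such a with key could be evaluated, here is a Python sketch. It assumes with holds a single dict of query tags and that a query matches when all of its key/value pairs are present in the other stream's tags. The matching rule, the kind tag and the stream contents are assumptions for the example, not datastream's actual query semantics.

```python
from typing import Any, Dict


def matches(query: Dict[str, Any], tags: Dict[str, Any]) -> bool:
    """True if every key/value pair in `query` is present in `tags`.
    Nested dicts (subdocuments) are compared recursively."""
    for key, expected in query.items():
        if key not in tags:
            return False
        actual = tags[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual != expected:
            return False
    return True


def drawn_together(stream: Dict[str, Any], other: Dict[str, Any]) -> bool:
    """Should `stream` be rendered on the same graph as `other`?"""
    query = stream.get("visualization", {}).get("with")
    return query is not None and matches(query, other)


# Hypothetical streams; "kind" is an invented tag marking network interface streams.
rx = {"node": "node-a", "kind": "interface", "title": "RX",
      "visualization": {"type": "line", "with": {"node": "node-a", "kind": "interface"}}}
tx = {"node": "node-a", "kind": "interface", "title": "TX",
      "visualization": {"type": "line", "with": {"node": "node-a", "kind": "interface"}}}
cpu = {"node": "node-a", "kind": "system", "title": "CPU load",
       "visualization": {"type": "line"}}
```

Under this rule an empty query matches every stream, and a stream added on its own is simply drawn alone because nothing else on the page matches its query.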

comment:61 Changed 5 years ago by mitar

So currently used tags are:

  • node – node's UUID
  • registry_id – registry key ID
  • title (string) – title of the stream, in English language
  • description (string) – longer description of the stream, in English language, HTML allowed
  • unit (string) – base SI unit
  • unit_description (string) – additional description for the unit, or description for a special unit when unit itself is not applicable, in English language
  • group (string) – arbitrary (internal) group name to which this stream belongs, useful when combining multiple streams together using with
  • visualization
    • type (enum of line|stack|state|point|event) – default visualization type
    • hidden (boolean) – should the visualization be hidden by default
    • value_downsamplers (list of enums of supported value downsamplers) – default value downsamplers used when zooming out
    • time_downsamplers (list of enums of supported time downsamplers) – default time downsamplers used when zooming out
    • with – (query tags) – query tags for which this stream is visualized together on one graph, if they are being visualized as well
    • weight (integer) – display order weight, larger weights are listed before lower weights
    • scale (enum of linear|logarithmic2|logarithmic10)
    • minimum (float) – known minimum value (for example, if you want that the graph is anchored at 0, even if values at a given interval start higher)
    • maximum (float) – known maximum value (for example, for LQ and ILQ we could set both min and max)
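
For illustration, this is roughly how tags of this shape could drive the default view, sketched in Python. The stream titles and values are invented (including the 0.0 to 1.0 range for link quality), and default_view is a hypothetical helper, not part of django-datastream.

```python
from typing import Any, Dict, List

streams: List[Dict[str, Any]] = [
    {"title": "Uptime", "unit": "s",
     "visualization": {"type": "line", "hidden": False, "weight": 10}},
    {"title": "Link quality",
     "visualization": {"type": "line", "hidden": False, "weight": 50,
                       "minimum": 0.0, "maximum": 1.0}},
    {"title": "Debug counter",
     "visualization": {"type": "point", "hidden": True, "weight": 99}},
]


def default_view(streams: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop streams that are hidden by default and list larger weights first."""
    visible = [s for s in streams if not s["visualization"].get("hidden", False)]
    # A missing weight is treated as 0 here; that default is an assumption.
    return sorted(visible, key=lambda s: s["visualization"].get("weight", 0), reverse=True)


titles = [s["title"] for s in default_view(streams)]
```

Hidden streams are only left out of the initial view; the user could still add them later through the interface that lists all available streams.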

We should think how to make reboot stream be visualized everywhere. Maybe simply with={}. :-)

BTW, I would, just to make sure, always in with add the node UUID to match over as well. Only if things are really network-wide I will not do that. So if we will at some point have a page displaying multiple streams from multiple nodes together. So probably 'node': fields.TagReference('node') is necessary?

Last edited 5 years ago by mitar

comment:62 Changed 5 years ago by kostko

BTW, I would, just to make sure, always in with add the node UUID to match over as well. Only if things are really network-wide I will not do that. So if we will at some point have a page displaying multiple streams from multiple nodes together. So probably 'node': fields.TagReference('node') is necessary?

Yes, a tag reference to node is necessary. I will update the existing tags.

comment:63 Changed 5 years ago by mitar

And for reboot stream we should do that with as well.

comment:64 Changed 5 years ago by mitar

We should add a visualization tag which would suggest a color, because currently all graphs are the same color.

comment:65 Changed 5 years ago by mitar

It seems we missed one feature. Currently, for per-link graphs, we allow selecting which link to display, showing the average by default. I think this is a needed feature, because otherwise there are simply too many graphs to display per node. Any suggestions on how to encode this into tags? Maybe as primary/secondary? Or parent/child?

comment:66 Changed 5 years ago by mitar

  • Keywords biggerproject added; biggerproject, gsoc removed
  • Component changed from nodewatcher/core to nodewatcher/modules

Removing for now from GSoC ideas list.

comment:67 Changed 4 years ago by kostko

  • Milestone changed from Next milestone to 3.0b

comment:68 Changed 4 years ago by kostko

  • Blocking changed from 764, 1008 to 764, 1008, 1124

comment:69 Changed 4 years ago by kostko

  • Blocking changed from 764, 1008, 1124 to 764, 1008, 1124, 1125

comment:70 Changed 4 years ago by kostko

  • Blocking changed from 764, 1008, 1124, 1125 to 288, 764, 1008, 1124, 1125

comment:71 Changed 4 years ago by kostko

  • Blocking changed from 288, 764, 1008, 1124, 1125 to 288, 437, 764, 1008, 1124, 1125

comment:72 Changed 4 years ago by kostko

  • Blocking changed from 288, 437, 764, 1008, 1124, 1125 to 288, 437, 534, 764, 1008, 1124, 1125

comment:73 Changed 3 years ago by mitar

I think this was implemented. We now use our datastream API together with the django-datastream HTTP REST interface and JavaScript code. We use Highcharts/Highstock to render charts on the client side.

We currently support line charts, with a min/max range around them at lower granularities. We support event streams, which we can overlay on other existing charts. And we support stacked area charts.

Streams configure how they are rendered through stream tags, specifically through the visualization tag.

So currently used tags are:

  • node – node's UUID
  • registry_id – registry key ID
  • title (string) – title of the stream, in English language (will be used for translation later on)
  • description (string) – longer description of the stream, in English language (will be used for translation later on), currently not used when rendering the stream in nodewatcher
  • unit (string) – base SI unit
  • unit_description (string) – additional description for the unit, or description for a special unit when unit itself is not applicable, in English language (will be used for translation later on); it is used to display the title of the y-axis
  • label (string) – event streams do not have units, but have a label displayed on their marks, in English language (will be used for translation later on)
  • message (string) – a longer message displayed for event streams when hovered over the event mark, in English language (will be used for translation later on)
  • group (string) – arbitrary (internal) group name to which this stream belongs, useful when combining multiple streams together using with
  • visualization
    • type (enum of line|stack|state|graph|event) – default visualization type
    • initial_set (boolean) – should the stream be rendered among nodewatcher node graphs by default (later on we will support users to add non-initial graphs manually)
    • hidden (boolean) – should the stream be hidden by default, it is still added to the chart, but it starts hidden, user has to show it through the chart legend
    • value_downsamplers (list of enums of supported value downsamplers) – default value downsamplers used when zooming out
    • time_downsamplers (list of enums of supported time downsamplers) – default time downsamplers used when zooming out
    • with – (query tags) – query tags for which this stream is visualized together on one graph, if they are being visualized as well; stream can be visualized together with multiple other streams
    • weight (integer) – display order weight, larger weights are listed before lower weights (not yet used)
    • scale (enum of linear|logarithmic2|logarithmic10) (not yet used)
    • minimum (float) – known minimum value (for example, if you want that the graph is anchored at 0, even if values at a given interval start higher)
    • maximum (float) – known maximum value (for example, for LQ and ILQ we could set both min and max)
Last edited 3 years ago by mitar

comment:74 Changed 3 years ago by mitar

Known issues: #1336.

comment:75 Changed 3 years ago by mitar

  • Status changed from accepted to closed
  • Resolution set to fixed