Index by title

h1. 1-year fixed-term contract (CDD), Research Engineer, C++/Qt developer, visualization/machine learning for satellite data

%{color:red}POSITION FILLED%

The space plasma team of the "Laboratoire de Physique des Plasmas":http://www.lpp.fr (LPP) is hiring a research engineer on a one-year fixed-term contract to work on the graphical application "SciQLOP":https://hephaistos.lpp.polytechnique.fr/redmine/projects/sciqlop/wiki, which searches and visualizes plasma data measured by satellites in the interplanetary medium and the magnetosphere. Combining statistical learning methods with an intuitive, efficient and modern interface, SciQLOP will be the only software of its kind worldwide.

h2. Context

For decades, satellite missions have been sent into space to measure the plasma and electromagnetic properties of our near-Earth and interplanetary environment. These data are currently stored in large public databases and described in standard formats. Exploring them and extracting intervals that show signatures of physical interest nevertheless remains difficult. The measured systems are highly dynamic, so observational signatures vary widely, which makes search methods based on fixed rules very inefficient. Visual exploration is therefore almost unavoidable today, but it is hard to reproduce, long and tedious. This is largely due to the lack of graphical tools for exploring databases in a way that is intuitive, efficient and independent of the originating mission. This project aims to develop graphical software that enables such exploration and has statistical learning methods at its core, for the intelligent and reproducible construction of catalogs of signatures of scientific interest. An open-source prototype has already been built in the laboratory and relies on the C++ Qt API. The machine learning side of the project is carried out in collaboration with the "applied mathematics laboratory of École Polytechnique":http://www.cmap.polytechnique.fr.

h2. Job description and mission

You will join the Space Plasmas team at LPP as a research engineer. Your mission will be to develop the SciQLOP application. Besides the graphical interface for visualizing and cataloging data intuitively and efficiently, you will research and propose machine learning methods that let SciQLOP learn to recognize interesting signatures and suggest them to the user. This research and development work will grow out of your interaction with the laboratory's scientists, who are experts in analyzing these data and in the instruments that produce them, and with the machine learning experts of the applied mathematics laboratory. The key points of the development are:

* Handling CDF data files from various missions
* Visualization of data in different formats (time series, images, 2D, 3D)
* Moving from the embedded Python terminal to an integrated iPython
* Module for managing virtual data catalogs
* Statistical learning module and catalog creation
* Collaborative/networked catalog creation
* Linux/Windows/Mac portability
* Documentation

h2. Salary

€2700 gross per month, to be adjusted according to experience.

h2. Location

You will be based at the "Laboratoire de Physique des Plasmas, on the École Polytechnique campus":http://www.lpp.fr/Comment-venir-au-LPP, in Palaiseau.

p=. !https://hephaistos.lpp.polytechnique.fr/redmine/attachments/download/1094/jpg_plan_lpp_X-81667.jpg!

h2. Your profile

You are motivated by graphical development and care about the ergonomics of your interfaces. You are very enthusiastic about developing a scientific application that is unique in the world and about being one of the pioneers of machine learning for the analysis of in situ space data. You are curious, show initiative and work with great autonomy. You enjoy sharing and teamwork.

h3. Recruitment level and experience

h3. Required skills and experience

h3. Desired skills

Having the following skills is a great asset:

h2. Contact us/apply

"Nicolas Aunai, Alexis Jeandet":mailto:nicolas.aunai@lpp.polytechnique.fr?cc=alexis.jeandet@lpp.polytechnique.fr


h1. Research engineer position (1 year), C++/Qt developer, visualization/machine learning for satellite data

%{color:red}POSITION FILLED%

The space plasma team of the "Laboratory of Plasma Physics":http://www.lpp.fr is hiring a research engineer for one year to develop the GUI application "SciQLOP":https://hephaistos.lpp.polytechnique.fr/redmine/projects/sciqlop/wiki, dedicated to the search and visualization of in situ spacecraft data measured in the magnetosphere and interplanetary space. Combining an intuitive and powerful user interface with machine learning methods, SciQLOP will be the first software of its kind for space data analysis.

h2. Context

For decades, satellite missions have been sent to space to measure plasma and electromagnetic fields in our nearby space environment. Although this data is continuously stored in large public databases in standard formats, exploring it and extracting intervals revealing signatures of physical interest remains quite difficult. The very dynamic nature of the observed systems results in a great variability of observational signatures, which makes methods based on a fixed set of rules, no matter how complex, very inefficient. Visual exploration is therefore almost unavoidable, but it is hardly reproducible, long and laborious, mainly because we lack graphical tools that allow intuitive and efficient exploration independent of the mission from which the data originates. This project aims to develop such graphical software, with machine learning methods at its core enabling feature recognition and smart cataloging of scientifically interesting intervals. A proof-of-concept graphical interface has already been developed at the laboratory and is based on the C++ Qt framework. The machine learning capabilities will be based to a large extent on existing packages and developed in collaboration with the laboratory of applied mathematics of Ecole Polytechnique.

h2. Job description

You will be part of the space plasma team at LPP as a research engineer, in charge of developing the SciQLOP software. Besides the intuitive graphical interface for visualization and cataloging, you will have to find, propose and implement machine learning solutions that allow SciQLOP to learn to recognize features in the data and suggest them to the user. Collaboration with the experts in spacecraft data and instruments in our lab, and with the machine learning experts of the applied mathematics lab, will be essential for this research and development work. The key points of your development will be:

* Deal with CDF files and mission databases
* Visualization of multiple format data (time series, images, 2D, 3D)
* Upgrade from python to full iPython embedded terminal
* Module for cataloging features
* Machine learning capabilities
* Collaborative/network cataloging
* Linux/Windows/Mac portability
* Documentation

h2. Location

You will be based at the Laboratory of Plasma Physics, at "Ecole Polytechnique":http://www.lpp.fr/Comment-venir-au-LPP, in Palaiseau.

!https://hephaistos.lpp.polytechnique.fr/redmine/attachments/download/1094/jpg_plan_lpp_X-81667.jpg!

h2. Salary

Gross salary: €2700/month, depending on experience and qualifications.

h2. You

You are motivated by developing graphical user interfaces and are particularly sensitive to their ergonomics. You are enthusiastic about developing a unique scientific application and being one of the pioneers of machine learning for space physics. You are curious, show initiative and work independently. You enjoy sharing and teamwork.

h3. Your experience and education

h3. Required skills

h3. Desirable skills

Having skills among the following is a great asset:

h2. Contact us/apply

"Nicolas Aunai, Alexis Jeandet":mailto:nicolas.aunai@lpp.polytechnique.fr?cc=alexis.jeandet@lpp.polytechnique.fr


Useful resources

Database facilities

CDF File Format

Known Time Descriptions in CDF Files

Plot and OpenGL in QtCharts

Mission resources

Types of plots

Cluster

THEMIS

Software

Visualization and analysis toolkits

Algorithms

Web Services

Space Weather

Models

CCMC

Event catalogs

Machine Learning


Job opportunities

You will find here all the past and current job opportunities associated with the SciQLOP project.


Known time description

========================

Double Time ranges : date from 01/01/1970 to 01/01/2100


µs, ns, ps might also be needed for tt2000

Double (IEEE754)

64bits

min value: –1.7977E+308

max value :1.7977E+308

Number of seconds per year = 60*60*24*365.25 = 31 557 600

Numbers for 100 years :

3 155 760 000 s = 3.15576 e+9 s

3 155 760 000 000 ms = 3.15576 e+12 ms

3 155 760 000 000 000 µs = 3.15576 e+15 µs

3 155 760 000 000 000 000 ns = 3.15576 e+18 ns

3 155 760 000 000 000 000 000 ps = 3.15576 e+21 ps

A double carries about 15 significant decimal digits (53-bit significand); beyond that we may experience precision loss.

The recommendation is to store time in QLop as microseconds since the Epoch (01-01-1970 00:00:00).

Known time description


|_.Mission Name |_.time var name |_.units |_.DEPEND |_.LABLAXIS |_.FIELDNAM |_.CATDESC |_.Type |_.VIRTUAL |_.nb of records |_.VAR_NOTES |
|Cluster FGM |time_tags__CDFNAME |ms |0 |UT |Universal Time |Interval centred time tag |CDF_EPOCH | |normal |field missing|
|Cluster HIA |time_tags__CDFNAME |ms |0 |UT |Center Time |Interval centred time tag |CDF_EPOCH | |normal |field missing|
||||||||||||
|Themis Efi,SCM |VARNAME_time |sec |TIME |UT |Same as time var name |UTC, in seconds since 01-Jan-1970 00:00:00 |CDF_DOUBLE | |normal |Unleaped seconds|
|Themis Efi,SCM |VARNAME_epoch |field missing |0 |UT |Same as time var name |Unrelated |CDF_EPOCH |true |0 |field missing|

|Themis Efi,SCM |VARNAME_dot0_epoch0|msec


The SciQLOP project

The project proposal can be found here


h1. Job offers

This page lists the past and current job offers related to the SciQLOP project.


h1. Plot and OpenGL in QtCharts

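The following test was used to determine the precision limits of the OpenGL rendering in QtCharts:

#include <cstdint>
#include <QApplication>
#include <QMainWindow>
#include <QVector>
#include <QtCharts/QLineSeries>
// Assumption: Chart and ChartView are the QChart/QChartView subclasses
// from the Qt Charts "Zoom Line" example; they are not QtCharts classes.
#include "chart.h"
#include "chartview.h"

// Bit-level view of an IEEE 754 double (GCC/Clang bit-field layout, little-endian).
typedef struct __attribute__((packed)) dbl_str{
uint64_t mant:52;
uint64_t exp:11;
uint64_t sign:1;
}dbl_str;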

typedef union dbl{
double dblval;
uint64_t intval;
dbl_str strval;
}dbl;

QT_CHARTS_USE_NAMESPACE

int main(int argc, char *argv[])
{
QApplication a(argc, argv);

QVector<double> timeVector;
dbl offset;
offset.strval.sign=0;
offset.strval.exp=0b01111111111;
offset.strval.mant=0b10000000000000000000000000000000000000000000000000000;
for(int i = 0; i<(1<<20);i++)
{
    timeVector.append(offset.dblval);
    offset.strval.mant+=1<<(52-24);
}

QLineSeries *seriesOGL = new QLineSeries();
seriesOGL->setUseOpenGL(true);

for(int i=0;i<timeVector.count();i++)
{
    double LUT[]={0.0,1.0,-1.0,2.0,-2.0,3.0};
    seriesOGL->append(timeVector.at(i), LUT[i%6]);
}

Chart *chart = new Chart();
chart->legend()->hide();

chart->addSeries(seriesOGL);

chart->createDefaultAxes();
chart->setTitle("Simple line chart example");

ChartView *chartView = new ChartView(chart);
chartView->setRenderHint(QPainter::Antialiasing);

QMainWindow window;
window.setCentralWidget(chartView);
window.resize(1400, 1300);
window.show();
return a.exec();

}

By changing this line:
offset.strval.mant+=1<<(52-23);
to
offset.strval.mant+=1<<(52-24);
we observed that the plot ignores any change in the double mantissa below its 23 most significant bits: with a step of 1<<(52-24), some points are stacked because the plot cannot tell two different abscissa values apart.

These 23 bits correspond to the size of the float mantissa.
We can therefore assume that the OpenGL plot uses floats.

We found the lines where the doubles are cast to floats in the QtCharts code.
This takes place in glxyseriesdata.cpp, in GLXYSeriesDataManager::setPoints: the x and y of each point are cast to floats.

The new float vector is then used by glwidget.cpp in the vertex and fragment shader source code called by GLWidget::paintGL.


SciQlop Status
Priorities : Google doc link
Scientific Objectives and Performances



|_.Functionality|_.Status (% done)|
|SciQLOP should be portable, and an executable/installer should be available for Linux, Mac and Windows|50|
|Installing SciQLOP must be easy and must not require reading documentation or installing dependencies|50|
|When opened, SciQLOP should by default reload the previous state (data products, plots, plugins, etc.)|0|
|SciQLOP comes with good user documentation, including galleries, examples and tutorials (video tutorials included).|50|
|SciQLOP should offer the same functionality on all platforms.|50|
|SciQLOP should offer the same performance on all platforms.|50|
|SciQLOP's GUI should remain light, beautiful and intuitive|50|
|Exploring databases and browsing data should be easier and faster than with other existing software|90|
|SciQLOP should provide efficient and transparent data browsing, access to user Python routines, and collaborative cataloging features with and without machine learning|0|
|SciQLOP should remain responsive when plotting 10 million points on a standard machine.|90|
|SciQLOP should have a limited list of data types it can handle and group them by common traits (scalars, vectors, images...).|100|
|SciQLOP may be able to provide features depending on the data type, source, unit or any condition/property the user may define or program.|50|
|Each modification of the data between the source (server) and the plot should be documented|?|
|SciQLOP should be usable by students as a training tool: provide users with knowledge toolkits associated with the quantities used|?|
|SciQLOP should provide users with easy access to documentation on data, missions, instruments, etc.|90|
|SciQLOP should allow users to access Wikipedia plasma/mission-related articles and suggest that users edit/add content.|0|
|SciQLOP should allow users to save and restore sessions.|0|
|A SciQLOP user session should contain all the data needed to restore its previous state.|0|



Code redistribution



|_.Functionality|_.Status (% done)|
|SciQLOP source code will be GPLv2.|100|
|SciQLOP source folder should contain the files: README, INSTALL, COPYING, CHANGELOG.|0|
|All SciQLOP source files should contain the GPLv2 header.|100|
|All SciQLOP dependencies should be compatible with GPLv2.|90|



Code versioning



|_.Functionality|_.Status (% done)|
|SciQLOP source code modifications should link features or bug fixes to code revisions|50|
|SciQLOP source code should be hosted on the laboratory server hephaistos1 with the Mercurial version control system.|100|



Code Writing



|_.Functionality|_.Status (% done)|
|SciQLOP developer's documentation, roadmap, issues, etc. will be managed in the Redmine application: https://hephaistos.lpp.polytechnique.fr/redmine/projects/sciqlop|100|
|Unit tests are developed for all modules and run after each important merge|90|
|SciQLOP's code is homogeneous in its syntax and philosophy to guarantee easy maintainability.|100|
|SciQLOP's code is self-explanatory; comments are used to explain goals and methods rather than instructions. Comments record development flow (hacks, todos, etc.)|95|
|SciQLOP's code is based on the Qt framework and plotting capabilities use the QtCharts API.|100|



SciQLop Modules



|_.Functionality|_.Status (% done)|
|\2. Core|
|\2. Database|
|All the native SciQLOP data types should be handled|100|
|The database module should allow commands/functions to be associated with data/data types from the GUI.|50|
|The database module should provide a way to view loaded products.|100|
|The database module GUI should show whether a product is in use and by which module (Plot5, plugin2, PythonContext...).|0|
|The database module GUI should show which plugin provides the data.|0|
|The origin of the data should be associated with metadata. Ex: if a product comes from a file, display the path and file metadata (dates, size, permissions, etc.). If it comes from a data downloader, display which one, e.g. amda, cdaweb, etc.|0|
|Data products are accessible through a tree dynamically built with a filter|100|
|Users can easily access metadata on data products (mission, resolution, unit, etc.) by interacting with the tree|95|
|Default tree representations are proposed to the user - TBD|?|
|Caveats and other useful information usually present in headers are available from the database's GUI|0|
|The database representation should highlight data products when they are selected on plots|0|
|Any data imported into the user session, whether from a file or from the Space data Module, must be added to the database|0|
|Garbage collector - TBD|?|
|Each data product in the database acquires a unique identifier|100|
|Data products loaded into the database are not mutable.|100|
|Users should be able to delete a product from the database through the contextual menu|100|
|Deleting a product from the database checks whether the product is still in use. If so, it warns the user that all plots will be updated accordingly|95 (no warning yet)|
|Multiple occurrences of the same data product (i.e. database entries differing by their UID, and possibly time interval) are grouped under the same data product name in the tree and appear by their UID/plot|0|
|The database also shows the variables currently defined in the Python session, in a separate tab. Although Python variables, like data products, contain data, they differ by being mutable and not associated with rich metadata such as origin, etc.|0|
|Users can manually add a Python function to the database from the context menu|0|
|\2. Data Downloader|
|Data downloader should be a singleton.|100|
|Data downloader may implement most protocols (HTTP, FTP, WebDAV...).|?|
|Data downloader should handle proxy servers.|100|
|Data downloader should know when data requested by the user is already locally accessible.|0|
|\2. Python Engine|
|SciQLOP functionalities are scriptable through the IPython terminal.|0|
|Users can interact with the database through Python: they may pull data from the database into Python or push data from Python to the database.|0|
|Users can interact with any plot from Python|0|
|Users can grab data from a plot or plot data with Python|0|
|Data in the Python terminal is exposed as numpy arrays and DataFrame objects, with time as index|0|
|\2. GUI Manager|
|GUI manager should be a singleton|0|
|GUI manager may provide an interface to populate menus from any module|0|
|\2. Catalogs & Community functionalities|
|\2. Plot|
|\2. General|
|SciQLOP should be able to plot all kinds of data relevant to space plasma missions (scalar, vector, tensor, spectrogram)|100|
|Each plot element (context: colorbars, ticks, titles, labels, etc.; data: lines, style, dots, etc.) should be clickable and customizable|40|
|Controls on figure items should be identical in all kinds of plots (e.g. changing title, labels, etc. is done identically for all kinds of plots).|100|
|A plot "style" (context and data style) can be copied and pasted onto another plot|0|
|Users may define custom plot styles and save/reload them as desired|0|
|Plots should provide visualization of available metadata (instrument modes, etc.)|0|
|Each plot/view should provide the context menu associated with the data type.|50|
|Each plot/view should allow the user to select a portion of data and perform specific operations on it, such as labelling it, building new plots or applying data analysis methods to it (e.g. getting statistical quantities such as mean/std, etc.).|50|
|Data can be selected by dragging a zone onto the plot: for example a time interval as a rectangle for time series, or a box in a 3D plot/view.|0|
|For time-dependent data the plot/view may ask for more data when the user scrolls to the borders of the plot/view.|0|
|Users should experience no significant lag related to downloading data when scrolling windows|0|
|Each plot/view should allow the integration of custom controls (user-defined GUI widgets) provided by the currently plotted data. The control may be provided as a QWidget.|0|
|Each plot/view should be easy to duplicate.|0|
|Figures/plots can be linked together or not.|100|
|Linked plots will zoom/unzoom and scroll together (horizontally)|100|
|A lens effect enables the user to zoom on a part of the data without changing the plot range (like a lens effect in Gimp-like software). This can be done by overlapping a plot window onto the plot.|0|
|Any plot should come with a default legend, customizable by the user|100|
|Any plot can be dragged to a folder and exported as an image (eps, png, jpg)|0|
|Users can select multiple plots by holding the shift key and drag them onto a folder to export images. SciQLOP asks the user whether multiple images or a summary plot should be produced.|0|
|The mouse wheel can be used to scroll in time when the figure is selected.|100|
|Plot adjustment controls appear at specific locations on mouse hover. This keeps the interface clear but still customizable.|0|
|\2. Scalar Time series|
|The contextual menu should give users rapid access to statistical functionalities such as mean(), std().|0|
|Data holes are handled in line style mode as a line segment.|100|
|Line width can be adjusted by clicking+mouse wheel|0|
|\2. Vector Time series|
|SciQLOP can plot one or more components of a vector as a function of time|0|
|SciQLOP is aware that the quantity is part of a vector. If Bx is plotted, it knows which products represent By and Bz and can easily add them to the plot.|0|
|The contextual menu on a vector plot should allow users to easily transform their vector into another basis, either user-provided or generic such as GSE, GSM, etc. or MVA.|0|
|Vectors displayed on linked plots will be represented in the same basis dynamically (a change of basis on one of the linked plots will change all of them)|0|
|Vectors can be plotted in 3D or 2D projection as 3D vectors as a function of time. This representation will be useful for visualizing aspects specific to vectors, such as changes of direction (e.g. waves & discontinuities), that are not easily viewable in the time series format.|0|
|\2. Spectrograms|
|SciQLOP should enable the same plot/context functionalities for all spectrograms, no matter the nature of the data (particles or waves); this includes color range, colormaps, etc.|100|
|Characteristic frequencies should be overplottable as time series.|0|
|Using WAMP for theoretical wave frequencies and damping should be possible|0|
|The plasma eigenmodes visualization toolkit can be called from the spectrogram contextual menu|0|
|Data holes are handled by showing the background color/texture|100|
|By clicking on the colorbar, the color range can easily be adjusted with cursors appearing on the colorbar on mouse hover|0|
|\2. Orbits|
|SciQLOP provides the user with three 2D projections of the magnetosphere/heliosphere and its key regions|0|
|Heliosphere/magnetosphere plots use different models (Tsyganenko, Shue, Parker spiral, etc.) easily interchangeable by the user.|0|
|Orbits of the satellites from which data is currently being viewed appear overplotted on the three 2D magnetosphere/heliosphere projections|0|
|The position of the spacecraft for the considered intervals must be clearly visible on the trajectory as a colored portion of the orbit, the color being associated with a particular plot (or set of linked plots).|0|
|Changing the interval range of a spacecraft on its orbit must change the plot time range accordingly and dynamically.|0|
|SciQLOP should offer interoperability with CDPP/3DView to view orbits in 3D|0|
|\2. SpaceData|
|\2. General|
|Satellites Data provider should provide access to CDAWeb data over the REST protocol.|100|
|Satellites Data provider may allow advanced pattern filtering for fast data retrieval|10|
|Satellites Data provider should allow generic search/filtering among data fields.|100|
|Satellites Data provider search/filter should accept regular expressions|?|
|\2. App|
|\2. Machine Learning|



Spectrograms

THEMIS

Particle spectrograms from the ESA (Electrostatic Analyzer) have different modes and data products


h1. Testlatex

$\frac{x^2}{\sqrt{\cos(x)}}$

\begin{equation}
\frac{x^2}{\sqrt{\cos(x)}}
\end{equation}


Wiki SciQLOP


SciQLOP (SCIentific Qt application for Learning from Observations of Plasmas) is an ergonomic and powerful tool enabling visualization and analysis of in situ spacecraft plasma data.

Documentation for Users

You can read the user guide.

Other resources:

Status of SciQLop Developments

Download

Visit the SciQLOP download page

Build from source

MacOS

brew install qt
brew install meson
vim ~/.bashrc

export PATH=/usr/local/Cellar/qt/5.9.2/bin:$PATH

source ~/.bashrc
mkdir /myPath/SciQLOP
git clone https://hephaistos.lpp.polytechnique.fr/rhodecode/HG_REPOSITORIES/LPP/SciQLOP_Repos/SciQLop /myPath/SciQLOP
/myPath/SciQLOP/build_cfg/mac/build.sh

The SciQLOP app is then in /tmp.

Screenshots

Developers documentation

The SciQLOP project