plotly scatterplots which show molecule images on hover!

molplotly

Plotly scatterplots which show molecule images on hovering over the datapoints!

Beautiful :)

Required packages:

âžĄī¸ See example.ipynb for an example :)

📜 Usage

import pandas as pd
import plotly.express as px

import molplotly

# load a DataFrame with smiles
df_esol = pd.read_csv('esol.csv')
df_esol['y_pred'] = df_esol['ESOL predicted log solubility in mols per litre']
df_esol['y_true'] = df_esol['measured log solubility in mols per litre']

# generate a scatter plot
fig = px.scatter(df_esol, x="y_true", y="y_pred")

# add molecules to the plotly graph - returns a Dash app
app = molplotly.add_molecules(fig=fig, 
                            df=df_esol, 
                            smiles_col='smiles', 
                            title_col='Compound ID', 
                            )

# run Dash app inline in notebook (or in an external server)
app.run_server(mode='inline', port=8011, height=1000)

Input parameters

  • fig : plotly.graph_objects.Figure object
    a plotly figure object containing datapoints plotted from df
  • df : pandas.DataFrame object
    a pandas dataframe that contains the data plotted in fig
  • smiles_col : str, optional
    name of the column in df containing the smiles plotted in fig (default 'SMILES')
  • show_img : bool, optional
    whether or not to generate the molecule image in the dash app (default True)
  • title_col : str, optional
    name of the column in df to be used as the title entry in the hover box (default None)
  • show_coords : bool, optional
    whether or not to show the coordinates of the data point in the hover box (default True)
  • caption_cols : list, optional
    list of column names in df to be included in the hover box (default None)
  • condition_col : str, optional
    name of the column in df that is used to color the datapoints in df - necessary when there is discrete conditional coloring (default None)
  • wrap : bool, optional
    whether or not to wrap the title text to multiple lines if the length of the text is too long (default True)
  • wraplen : int, optional
    the threshold length of the title text before wrapping begins - adjust when changing the width of the hover box (default 20)
  • width : int, optional
    the width in pixels of the hover box (default 150)
  • fontfamily : str, optional
    the font family used in the hover box (default 'Arial')
  • fontsize : int, optional
    the font size used in the hover box - the font of the title line is fontsize+2 (default 12)
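
To make the wrap and wraplen options concrete, here is a minimal sketch of how a long title could be split into lines for an HTML hover box. It relies only on Python's standard textwrap module. The helper name wrap_title is hypothetical, and molplotly's own implementation may well differ.

```python
import textwrap


def wrap_title(title, wrap=True, wraplen=20):
    """Break a title into lines of at most `wraplen` characters.

    Hypothetical helper for illustration only; not part of molplotly's API.
    Lines are joined with <br> on the assumption that the hover box is HTML.
    """
    if not wrap or len(title) <= wraplen:
        return title
    return "<br>".join(textwrap.wrap(title, width=wraplen))


short = wrap_title("Benzene")
wrapped = wrap_title("2-acetyloxybenzoic acid (aspirin)", wraplen=20)
unwrapped = wrap_title("2-acetyloxybenzoic acid (aspirin)", wrap=False)
```

If the hover box is made wider via width, a larger wraplen keeps titles from wrapping earlier than necessary.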

Output parameters

By default, a JupyterDash app is returned, which can be run inline in a Jupyter notebook or deployed on a server via app.run_server().

Acknowledgements

Features to-add:

  1. Individual styles for each caption (fonts, colors etc)
  2. Some way to save the plot
  3. Highlight points by clicking on them
  4. SVG image generation
Comments
  • Error with dash 2.3.0

    Hello,

    First of all thanks for this great package.

    Today, I got the following error while importing molplotly whereas I previously had no issue: ImportError: cannot import name 'Input' from 'dash'

    As my version of dash was a bit old (1.6, I believe), I upgraded it to version 2.3.0 via pip. The import error disappeared, but I got another error when trying to run the server with "app.run_server(mode='inline', port=8003, height=800)": AttributeError: ('Read-only: can only be set in the Dash constructor or during init_app()', 'requests_pathname_prefix')

    I have managed to get around by downgrading dash to version 2.0.0 as recommended here https://stackoverflow.com/questions/70908709/jupyterdash-app-run-server-error-using-jupyter-notebook, but there may be something to look into...

    Thanks again

    opened by remseven 4
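
A quick way to see whether an environment is affected by a report like this one is to check the installed dash version before running the app. The sketch below uses only the standard library. The helper names are hypothetical, and the "known good" version (2.0.0) is simply the one the reporter says worked at the time, not a permanent requirement.

```python
import itertools
from importlib import metadata


def parse_version(text):
    """Turn '2.3.0' into (2, 3, 0), reading leading digits of each part."""
    parts = []
    for piece in text.split("."):
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def matches_known_good(version, known_good=(2, 0, 0)):
    """True if `version` equals the version reported to work (an assumption)."""
    return parse_version(version) == known_good


def installed_dash_version():
    """Return the installed dash version string, or None if dash is absent."""
    try:
        return metadata.version("dash")
    except metadata.PackageNotFoundError:
        return None
```

For example, matches_known_good("2.3.0") is False, which corresponds to the failing setup described above.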
  • Pip install error

    Great idea, this package! However, I ran into an issue during installation: UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 1822: character maps to <undefined>. This is certainly due to unusual characters in the readme, since I could install the package locally after editing the readme file. Cheers!

    opened by ArnaudGaudry 4
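
An error of this kind is consistent with a README containing emoji being read with a legacy Windows code page instead of UTF-8. The sketch below reproduces that failure mode and shows the usual remedy of passing an explicit encoding. It assumes the setup script reads the README to build the long description, which the report does not confirm; the function name read_long_description is hypothetical.

```python
import tempfile
from pathlib import Path


def read_long_description(path):
    """Read a README as UTF-8 regardless of the platform default encoding."""
    return Path(path).read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    readme = Path(tmp) / "README.md"
    # U+27A1 U+FE0F (right arrow emoji) ends in byte 0x8f when UTF-8 encoded.
    readme.write_text("\u27a1\ufe0f See example.ipynb", encoding="utf-8")

    text = read_long_description(readme)

    # Decoding the same bytes as cp1252 fails because 0x8f is unmapped there.
    try:
        readme.read_text(encoding="cp1252")
        legacy_decode_ok = True
    except UnicodeDecodeError:
        legacy_decode_ok = False
```

With the explicit encoding the README text round-trips, while the cp1252 read raises UnicodeDecodeError.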
  • Dependency versions for pip install

    Is there some reason for the very specific version requirements for the dependencies? For the pip install I would loosen this up unless there are any particular known issues.

    opened by kjelljorner 3
  • Setup testing + CI

    Best not to wait too long to start testing. Here's a basic scaffold to get started.

    Probably shouldn't be merged as is but y'all can push more commits here to get this in shape.

    opened by janosh 3
  • Plotting in a running Dash app

    Hi! I have a small dash app that I use to explore the molecules present in different samples. It is possible to select the samples of interest and then display a plotly scatter plot generated using the structures. I tried to add the molplotly layer on the scatter plot but no molecules are displayed. Any experience on that? Thanks!

    opened by ArnaudGaudry 2
  • Defining color and markers simultaneously in px.scatter causes issues with hoverbox

    Hi there, thanks for providing a great and easy to use tool!

    This issue is reproducible with the first example in the documentation:

    df_esol['delY'] = df_esol["y_pred"] - df_esol["y_true"]
    fig_scatter = px.scatter(df_esol,
                             x="y_true",
                             y="y_pred",
                             color='delY',
                             symbol='Minimum Degree', # <- addition (px.scatter takes `symbol`, not `marker`)
                             title='ESOL Regression (default plotly)',
                             labels={'y_pred': 'Predicted Solubility',
                                     'y_true': 'Measured Solubility',
                                     'delY': 'ΔY'},
                             width=1200,
                             height=800)
    
    # This adds a dashed line for what a perfect model _should_ predict
    y = df_esol["y_true"].values
    fig_scatter.add_shape(
        type="line", line=dict(dash='dash'),
        x0=y.min(), y0=y.min(),
        x1=y.max(), y1=y.max()
    )
    
    fig_scatter.update_layout(title='ESOL Regression (with add_molecules!)')
    
    app_scatter = molplotly.add_molecules(fig=fig_scatter,
                                          df=df_esol,
                                          smiles_col='smiles',
                                          title_col='Compound ID',
                                          color_col='delY' # <- addition
                                          )
    
    # change the arguments here to run the dash app on an external server and/or change the size of the app!
    app_scatter.run_server(mode='inline', port=8001, height=1000)
    
    

    This returns

    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/molplotly/main.py in display_hover(
        hoverData={'points': [{'bbox': {'x0': 948.39, 'x1': 950.39, 'y0': 177.7, 'y1': 179.7}, 'curveNumber': 0, 'marker.color': -0.48000000000000004, 'pointIndex': 960, 'pointNumber': 960, 'x': 0.79, 'y': 0.31}]}
    )
        111             df_curve = df[df[color_col] ==
        112                           curve_dict[curve_num]].reset_index(drop=True)
    --> 113             df_row = df_curve.iloc[num]
            df_row = undefined
            df_curve.iloc = <pandas.core.indexing._iLocIndexer object at 0x7f7e3d16c950>
            num = 960
        114         else:
        115             df_row = df.iloc[num]
    
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(
        self=<pandas.core.indexing._iLocIndexer object>,
        key=960
    )
        929 
        930             maybe_callable = com.apply_if_callable(key, self.obj)
    --> 931             return self._getitem_axis(maybe_callable, axis=axis)
            self._getitem_axis = <bound method _iLocIndexer._getitem_axis of <pandas.core.indexing._iLocIndexer object at 0x7f7e3d490c50>>
            maybe_callable = 960
            axis = 0
        932 
        933     def _is_scalar_access(self, key: tuple):
    
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(
        self=<pandas.core.indexing._iLocIndexer object>,
        key=960,
        axis=0
    )
       1564 
       1565             # validate the location
    -> 1566             self._validate_integer(key, axis)
            self._validate_integer = <bound method _iLocIndexer._validate_integer of <pandas.core.indexing._iLocIndexer object at 0x7f7e3d490c50>>
            key = 960
            axis = 0
       1567 
       1568             return self.obj._ixs(key, axis=axis)
    
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_integer(
        self=<pandas.core.indexing._iLocIndexer object>,
        key=960,
        axis=0
    )
       1498         len_axis = len(self.obj._get_axis(axis))
       1499         if key >= len_axis or key < -len_axis:
    -> 1500             raise IndexError("single positional indexer is out-of-bounds")
            global IndexError = undefined
       1501 
       1502     # -------------------------------------------------------------------
    
    IndexError: single positional indexer is out-of-bounds
    

    Using either only marker or color alone causes no issues with the hoverbox. Also, using Minimum Degree as color_col for add_molecules when both color and symbol are defined gives no issues.

    opened by chertianser 2
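
The traceback above suggests why the lookup fails: the hover event reports a point index within one plotly trace, but the DataFrame is filtered by a column that does not define the traces. Below is a simplified, pure-Python model of that lookup. All names are hypothetical, and it assumes plotly draws one trace per distinct value of the symbol column, in order of first appearance; it is not molplotly's actual code.

```python
def rows_for_curve(rows, split_col, curve_value):
    """Rows that would belong to the trace identified by `curve_value`."""
    return [row for row in rows if row[split_col] == curve_value]


def lookup_hovered_row(rows, split_col, curve_values, curve_num, point_num):
    """Map a hover event (curveNumber, pointNumber) back to a data row."""
    subset = rows_for_curve(rows, split_col, curve_values[curve_num])
    # Raises IndexError when split_col is not the column that splits traces.
    return subset[point_num]


rows = [
    {"smiles": "CCO", "delY": -0.48, "Minimum Degree": 1},
    {"smiles": "c1ccccc1", "delY": 0.10, "Minimum Degree": 2},
    {"smiles": "CC(=O)O", "delY": 0.25, "Minimum Degree": 1},
]
# Assumption: traces are split by 'Minimum Degree' (the symbol column).
curve_values = [1, 2]

# Filtering by the column that defines the traces finds the right row.
hit = lookup_hovered_row(rows, "Minimum Degree", curve_values, 0, 1)

# Filtering by the continuous colour column leaves too few rows.
try:
    lookup_hovered_row(rows, "delY", curve_values, 0, 1)
    mismatch_raises = False
except IndexError:
    mismatch_raises = True
```

This matches the reporter's observation that passing Minimum Degree as color_col avoids the error.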
  • Removed support for python 3.7

    Thanks for the great library, really useful!

    You pinned the pandas version to ~=1.4.1, which effectively cuts off users on Python 3.7, since pandas 1.4 requires Python 3.8 or later. See the pandas release notes: https://pandas.pydata.org/docs/whatsnew/v1.4.0.html

    Is this intentional? Which novel features from pandas >1.4.0 are strictly necessary to keep the package running? I'll create a PR with a relaxed pandas requirement that works fine for me in a python3.7 env.

    opened by jannisborn 1
  • Input params as table

    I find a list of parameters easier to parse in a table than as a list. Here's what that would look like. If you think it's an improvement, you could consider joining the type and default columns to save horizontal space.

    After: [screenshot: Screen Shot 2022-03-02 at 09 37 16]

    Before: [screenshot: Screen Shot 2022-03-02 at 09 39 13]

    opened by janosh 1
  • Rokas slider

    Implemented support for specifying multiple smiles in the smiles_col argument for molplotly.add_molecules:

    • When a single str argument is passed to smiles_col, the function behaves as before.
    • When a list is passed, a slider is created under the plot, which allows the user to decide which column to use to render molecules.

    Also changed the ports in example.ipynb to start from 8700 and go up. On my system, 8000 and 8001 were already reserved.

    opened by RokasEl 1
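
The str-or-list behaviour described in this pull request can be sketched as a small normalisation step. The helper names are hypothetical, and the rule that a slider appears only when more than one column is given is an assumption drawn from the description rather than from the code.

```python
def normalize_smiles_cols(smiles_col):
    """Return a list of column names from either a str or a list of str."""
    if isinstance(smiles_col, str):
        return [smiles_col]
    return list(smiles_col)


def needs_slider(smiles_col):
    """A slider is only useful when there is more than one column to pick."""
    return len(normalize_smiles_cols(smiles_col)) > 1
```

With this, needs_slider('smiles') is False and needs_slider(['smiles', 'scaffold']) is True, where 'scaffold' is just an illustrative column name.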
  • Error when using with Plotly Subplots

    When trying to use molplotly to generate hover structures with a series of scatterplots generated using make_subplots (generated using different columns of a dataframe for the same RDKit molecule row), molplotly.add_molecules returns

    ValueError: More than one plotly curve in figure - color_col and/or marker_col needs to be specified.

    As these plots are generated using different columns, rather than faceting data in a single column based on values in another, there is no common color or marker column. Is there a way to generate molecular structures for these subplots?

    opened by matthewtoholland 3
  • Matrix distance to scatter plot

    Hello everyone,

    My name is Judith and for my PhD studies, I would like to use your beautiful scripts. I get a distance matrix by rmsd between each pose but I don't see how to pass it to a scatter plot of 2 clusters, I tried with pandas but I'm really blocked, I can't select the lines and the columns to generate the scatter plot

    Best Regards, Judith

    opened by JudKil 7
  • Saving interactive plots

    Thanks for the great package!

    It would be fantastic if the interactive plots could be exported/saved. I understand that this is non-trivial in plotly, but other libraries like mpl3d also allow exporting as interactive HTML or SVG. See here for an exemplary plot. TMAP and Faerun also support this natively. I think it will be a heavily sought-after feature for real usability of this package.

    Possible solutions:

    • separate integration building on top of mpl3d (seems overkill, might be the last resort)
    • Building upon this gist to export the Dash as HTML: https://gist.github.com/exzhawk/33e5dcfc8859e3b6ff4e5269b1ba0ba4
    • Faerun-style solution, see here: https://github.com/reymond-group/faerun-python
    opened by jannisborn 2
Releases (v1.1.5)
  • v1.1.5 (Nov 24, 2022)

    Added ability to plot 3D coordinates from RDKit Mol objects as highlighted in issue #20, as well as making facet plots as raised in issue #21.

  • v1.1.4 (Jun 14, 2022)

    Loosened package dependencies to address issue #18. Minimum version requirements for dash, jupyter-dash, and werkzeug are specified but everything else e.g. pandas is loosened.

  • v1.1.3 (Jun 1, 2022)

  • v1.1.2 (Apr 6, 2022)

  • v1.1.1 (Mar 1, 2022)

  • v1.1.0 (Mar 1, 2022)

    Added features, formatting, and bug fixes :)

    • Simultaneous plotting of multiple smiles columns (pull request #1) can now be done by passing in a list of smiles columns into smiles_col (see examples/multiple_smiles_columns.ipynb for a tutorial).
    • Adjusting of hover box transparency (issue #3) can now be controlled with alpha and mol_alpha arguments (see entry in examples/simple_usage_and_formatting.ipynb for example usage).
    • Usage examples split into multiple notebooks and organised in examples folder.
    • Fixed bug (issue #6) resulting from specifying both color_col and marker_col.

  • v1.0.1 (Feb 11, 2022)

    It seems that Dash got updated to 2.1.0 without me realising and of course that broke jupyter-dash 😅 this is a hotfix to the requirements specifying that dash 2.0.0 is required.

  • v1.0.0 (Feb 11, 2022)
