A wide variety of packages exist for generating charts and rich (styled) tables directly from dataframe-type data objects in Python and R.
Creating charts directly from data tables means that the charts are guaranteed to fairly represent the original data, as long as the chart definition is sound.
A variety of statistical charting packages also exist that perform statistical operations on the provided data before rendering an appropriate statistical chart (for example, histogram charts that bin and count raw data values).
This overview introduces some of the charting approaches that are possible in a Python environment. Charting tools in R will be covered elsewhere.
seaborn statistical charting package
The seaborn statistical data visualization package is a Python package that provides a wide range of chart types that can be used to generate static charts from pandas dataframes.
Conveniently, the package includes sample datasets that can be used to demonstrate the various available chart types.
%%capture
# Install seaborn only if it is not already available
try:
    import seaborn
except ImportError:
    %pip install seaborn
import seaborn as sns
View a fragment of a sample dataset:
df_penguins = sns.load_dataset("penguins")
df_penguins.head()
As an example of generating a statistical plot directly from the data, consider the construction of a histogram:
chart = sns.histplot(data=df_penguins, x="flipper_length_mm");
The bin widths and counts within bins are handled automatically by the chart based on the provided (raw) data.
In some teaching and learning examples, we may want learners to work through the data processing steps required to create the plot; but in other cases, where we just want to create the chart itself, using the statistical chart type (if we trust it!) is simpler.
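The binning and counting steps that a histogram chart performs automatically can be sketched by hand. The following is a minimal illustration rather than a description of how seaborn works internally: it uses numpy's `histogram` function, a choice of five equal-width bins, and a handful of made-up values (they are not taken from the penguins dataset).

```python
import numpy as np
import pandas as pd

# Illustrative values only; these are NOT rows from the penguins dataset
lengths = pd.Series(
    [181, 186, 195, 193, 190, 181, 195, 210, 215, 222],
    name="flipper_length_mm",
)

# Step 1: split the data range into five equal-width bins and count
# how many values fall into each one (the bin count is our own choice)
counts, edges = np.histogram(lengths.dropna(), bins=5)

# Step 2: tabulate the bins alongside their counts
binned = pd.DataFrame(
    {"bin_start": edges[:-1], "bin_end": edges[1:], "count": counts}
)
print(binned)
```

A table like this could then be passed to a simple bar chart, which makes each processing step visible to the learner at the cost of a little more code.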
We can also save the chart to a file and then load it back in to display it:
from IPython.display import Image

# Save the figure that contains the chart, then display the saved image
fn_sns = "sns_output.png"
chart.figure.savefig(fn_sns)
Image(fn_sns)
Creating Interactive Charts
As well as creating static charts, a wide variety of packages exist that are capable of rendering interactive charts. One such example is the plotly Python graphing library.
%%capture
# Install plotly only if it is not already available
try:
    import plotly
except ImportError:
    %pip install plotly
As with seaborn, various sample datasets are included in the package, such as the famous iris dataset:
import plotly.express as px

df_iris = px.data.iris()
df_iris.head()
Interactive charts can be created by calling the appropriate chart type with the dataframe and specifying which data columns should be used for which chart dimension.
The result is a fully interactive chart that includes tools for zooming and panning around the chart and for saving the current chart view to a file.
iris_plot = px.scatter(df_iris, x="sepal_width", y="sepal_length", color="petal_length")
iris_plot

# The HTML chart can also be saved to a file
#iris_plot.write_html("plotly_demo.html")