Python Forum
[solved] Matplotlib - histogram with percentages
#1
Hi,

I'm still trying to figure out how to show percentages on the y-axis instead of the number of occurrences. I'm not familiar with histograms in matplotlib, so I'm surely missing something!

Thanks for your advice.
    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.ticker import PercentFormatter

    # range [0, 5[ = 5 => 71.4 %
    # range [5, 10[ = 1 => 14.3%
    # range [10, 15[ = 0 => 0%
    # range [15, 20[ = 1 => 14.3%
    data = np.asarray([1, 1, 5, 3, 4, 16, 2])
    interval = [0, 5, 10, 15, 20]
    hx, hy, tmp = plt.hist(data, bins = interval, color="red", linestyle='-', linewidth=1, align = 'mid')
    # hx = count
    # hy = interval
    summ = np.sum(hx)
    percentage = (np.asarray(hx) / summ)
    # plt.gca().yaxis.set_major_formatter(PercentFormatter(xmax = 100))
    plt.grid()
    plt.savefig("tmp.png", bbox_inches='tight')
    plt.show()
[Image: tmp.png]
#2
Hi,

I'm still trying to figure out how to show percentages on the y-axis instead of the number of occurrences. I'm not familiar with histograms in matplotlib, so I'm surely missing something!
(I'll also need to rework the grid size :-)

Thanks for your advice.

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.ticker import PercentFormatter

    # range [0, 5[ = 5 => 71.4 %
    # range [5, 10[ = 1 => 14.3%
    # range [10, 15[ = 0 => 0%
    # range [15, 20[ = 1 => 14.3%
    data = np.asarray([1, 1, 5, 3, 4, 16, 2])
    interval = [0, 5, 10, 15, 20]
    hx, hy, tmp = plt.hist(data, bins = interval, color="red", linestyle='-', linewidth=1, align = 'mid')
    # hx = count
    # hy = interval
    summ = np.sum(hx)
    percentage = (np.asarray(hx) / summ)
    # plt.gca().yaxis.set_major_formatter(PercentFormatter(xmax = 100))
    plt.grid()
    plt.savefig("tmp.png", bbox_inches='tight')
    plt.show()
#3
OK, I got it: passing per-sample weights to plt.hist makes the bar heights percentages.

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.ticker import PercentFormatter

    # range [0, 5[ = 5 => 71.4%
    # range [5, 10[ = 1 => 14.3%
    # range [10, 15[ = 0 => 0%
    # range [15, 20[ = 1 => 14.3%
    data = np.asarray([1, 1, 5, 3, 4, 16, 2])
    interval = [0, 5, 10, 15, 20]
    # Each sample is weighted by 100/N, so the bar heights sum to 100
    weights2apply = (np.ones_like(data) / float(len(data))) * 100
    # hx = percentage per bin, hy = bin edges
    hx, hy, tmp = plt.hist(data, bins=interval, weights=weights2apply, color="red", linestyle='-', linewidth=1, align='mid')
    plt.title('Histogram with percentages')
    plt.xticks(interval)
    # Heights are already on a 0-100 scale, hence xmax=100
    plt.gca().yaxis.set_major_formatter(PercentFormatter(xmax=100))
    plt.grid()
    plt.savefig("tmp.png", bbox_inches='tight')
    plt.show()
[Image: tmp.png]
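For anyone who wants to check the numbers in the comments without plotting, here is a minimal sketch that applies the same weighting idea through np.histogram. It is not part of the original solution. The name `weights` is chosen here for illustration, and it assumes the same data and bin edges as above.

```python
import numpy as np

data = np.asarray([1, 1, 5, 3, 4, 16, 2])
interval = [0, 5, 10, 15, 20]

# Each sample contributes 100/N percent, so the weighted bin totals sum to 100.
weights = np.full(data.shape, 100.0 / data.size)
percentages, edges = np.histogram(data, bins=interval, weights=weights)

# Print one line per bin: left edge, right edge, percentage of samples.
for left, right, pct in zip(edges[:-1], edges[1:], percentages):
    print(f"[{left}, {right}[ -> {pct:.1f} %")
```

The per-bin values should match the bar heights returned by plt.hist when it is called with the same bins and weights.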
#4
Answered by XGrok:
Here's the corrected and improved version of your code.

You probably meant to show the **y-axis** as percentages (which is very common for histograms — "relative frequency" or "percentage").
Showing the **x-axis** as a percentage usually doesn't make sense here because the x-axis represents your data values (from 0 to 20).

### Corrected code (y-axis as percentage):

import numpy as np
import matplotlib.pyplot as plt

data = np.asarray([1, 1, 5, 3, 4, 16, 2])
bins = [0, 5, 10, 15, 20]

fig, ax = plt.subplots(figsize=(8, 5))

# Weight each sample by 100/N so the bar heights are percentages, not counts
weights = np.full(len(data), 100.0 / len(data))
n, bin_edges, patches = ax.hist(data, bins=bins, weights=weights, color="red",
                                edgecolor='black', linewidth=1, align='mid')

# With these weights, n already holds the percentage in each bin
percentages = n

# Set y-axis to percentage
ax.set_ylabel('Percentage (%)')
ax.set_yticks(np.linspace(0, 100, 6))  # nice ticks: 0, 20, 40, 60, 80, 100

# Optional: add percentage labels on top of each bar
for i, pct in enumerate(percentages):
    if pct > 0:  # only show label if bar exists
        ax.text(bin_edges[i] + (bin_edges[i+1] - bin_edges[i])/2, pct + 0.5,
                f'{pct:.1f}%', ha='center', va='bottom', fontsize=10)

ax.set_xlabel('Value')
ax.set_title('Histogram with Percentage on Y-axis')
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig("histogram_percentage.png", bbox_inches='tight', dpi=200)
plt.show()
### If you really wanted the x-axis in percentage (e.g. cumulative percentage or normalized bins):

Let me know your exact goal and I can adjust it.

But in most cases, when people ask to "show as percentage" with a histogram, they want the **heights** (y-axis) to be percentages, which is what the code above does.

Would you like a version with both axes labeled nicely, or a cumulative percentage histogram? Just say the word!
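For the cumulative variant mentioned above, a minimal sketch could look like the following. It assumes the same data and bins as in the thread and combines per-sample weights with the `cumulative=True` option of `ax.hist`. The names `weights` and `cum_pct` are chosen here for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter

data = np.asarray([1, 1, 5, 3, 4, 16, 2])
bins = [0, 5, 10, 15, 20]

# Each sample contributes 100/N percent to its bin.
weights = np.full(len(data), 100.0 / len(data))

fig, ax = plt.subplots(figsize=(8, 5))

# cumulative=True makes each bar the running total of all bins up to and including it.
cum_pct, bin_edges, _ = ax.hist(data, bins=bins, weights=weights, cumulative=True,
                                color="red", edgecolor="black", linewidth=1)

# Heights are on a 0-100 scale, so tell the formatter that 100 means 100%.
ax.yaxis.set_major_formatter(PercentFormatter(xmax=100))
ax.set_xticks(bins)
ax.set_ylim(0, 100)
ax.set_xlabel("Value")
ax.set_ylabel("Cumulative percentage")
ax.grid(True, alpha=0.3)
```

Call plt.show() or fig.savefig(...) afterwards to display or save the figure. The last bar should reach 100%, since every sample falls inside the given bins.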