Solutions for Informatics Practices, Class 12, CBSE
Assertion. The matplotlib library of Python is used for data visualization.
Reason. The PyPlot interface of matplotlib library is used for 2D plotting.
Both A and R are true and R is the correct explanation of A.
Explanation
The Matplotlib library is used for data visualization in Python, providing a variety of tools and functionalities for creating different types of plots, charts, and graphs. The Pyplot module, which is a collection of methods within the Matplotlib library, allows users to construct 2D plots easily and interactively.
Assertion. A scatter chart simply plots the data points on a chart to show the trend in the data.
Reason. A line chart connects the plotted data points with a line.
Both A and R are true but R is not the correct explanation of A.
Explanation
The scatter chart is a graph of plotted points on two axes that show the relationship between two sets of data. On the other hand, a line chart, or line graph, is a type of chart that displays information as a series of data points called 'markers' connected by straight line segments.
Assertion. Both scatter() and plot() functions of PyPlot can create scatter charts.
Reason. The plot() function can create line charts as well as scatter charts.
Both A and R are true and R is the correct explanation of A.
Explanation
Both the scatter() and plot() functions in Pyplot can create scatter charts. The plot() function can create line charts as well as scatter charts: when a marker style string (such as 'o') is given without a line style, plot() draws only the markers, which produces a scatter chart.
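The point above can be checked with a minimal sketch. The sample data and the Agg backend (chosen so the code runs without a display) are assumptions for illustration, not part of the original question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

# Hypothetical sample data
x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 5, 3]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

# The format string 'o' names a marker but no line style,
# so plot() draws the points without a connecting line.
(points,) = ax1.plot(x, y, 'o')
ax1.set_title("plot(x, y, 'o')")

# scatter() draws unconnected markers by design.
collection = ax2.scatter(x, y)
ax2.set_title("scatter(x, y)")
```

In an interactive session, calling plt.show() afterwards displays the two charts side by side.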
Assertion. For the same sets of data, you can create various charts using plot(), scatter(), pie(), bar() and barh().
Reason. All the data sets of a plot(), scatter(), bar() cannot be used by pie() ; it will work with only a single set of data.
A is false but R is true.
Explanation
We can create various charts using plot(), scatter(), bar(), and barh() for the same datasets, but not using pie(). The pie() function works with only a single set of data, whereas the other functions can handle multiple datasets or series.
Assertion. Five-point statistical summary of a data set can be visually represented.
Reason. The boxplot() function can plot the highest and lowest numbers of a data range, its median along with the upper and lower quartiles.
Both A and R are true and R is the correct explanation of A.
Explanation
The five-point statistical summary of a dataset can be visually represented through a box plot. A box plot is used to display the range and middle half of ranked data. It uses five important numbers from the data range: the extremes (highest and lowest numbers), the median, and the upper and lower quartiles, comprising the five-number statistical summary.
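As a rough illustration, the five numbers can be computed with NumPy and compared with what boxplot() draws. The weights list is reused from a later exercise in this chapter; the use of np.percentile and the Agg backend are assumptions made for this sketch.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

weights = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]

# Five-number summary: lowest, lower quartile, median, upper quartile, highest
lowest, q1, median, q3, highest = np.percentile(weights, [0, 25, 50, 75, 100])

fig, ax = plt.subplots()
result = ax.boxplot(weights)

# The median line drawn by boxplot() sits at the same value
drawn_median = result['medians'][0].get_ydata()[0]
print(lowest, q1, median, q3, highest, drawn_median)
```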
Assertion. Line graph is a tool for comparison and is created by plotting a series of several points and connecting them with a straight line.
Reason. You should never use a line chart when the chart is in a continuous data set.
A is true but R is false.
Explanation
A line graph is a tool for comparison, created by plotting a series of data points called 'markers' and connecting them with straight lines. This makes it easier to compare different data points and observe patterns. Line charts are suitable for continuous data sets, displaying information as a series of data and not restricted to discontinuous data sets.
(i) For changing the width of the line, we use the linewidth argument with the plot() function as: <matplotlib.pyplot>.plot(<data1>, [,data2], linewidth = <width> )
(ii) For changing the color of the line, we use the color argument with the plot() function as: <matplotlib.pyplot>.plot(<data1>, [,data2], <color code>)
The data points being plotted on a graph/chart are called markers. To change the marker type and color, we use the following additional optional arguments in the plot() function: marker = <valid marker type>, markeredgecolor = <valid color>.
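The arguments described above can be combined in a single plot() call. The following is a small sketch with made-up data; the specific color, width, and marker values are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

# Hypothetical sample data
x = [1, 2, 3, 4]
y = [10, 20, 15, 25]

fig, ax = plt.subplots()

# color and linewidth style the line; marker and markeredgecolor style the points
(line,) = ax.plot(x, y, color='green', linewidth=2.5,
                  marker='d', markeredgecolor='red')
```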
No, bar charts and histograms are not same. A bar chart or bar graph is a chart or graph that presents categorical data with rectangular bars with heights or lengths proportional to the values that they represent. On the other hand, a histogram is a type of graph that provides a visual interpretation of numerical data by indicating the number of data points that lie within a range of values.
pie()
Reason — The pie() function in Matplotlib's Pyplot module can plot only one data sequence. On the other hand, functions like plot(), bar(), and barh() can plot multiple data series in a single chart.
line
Reason — A line chart or line graph is a type of chart which displays information as a series of data points called "markers" connected by straight line segments.
scatter
Reason — A scatter chart is a graph of plotted points on two axes that show the relationship between two sets of data.
histogram
Reason — A histogram is a summarization tool for discrete or continuous data. A histogram provides a visual interpretation of numerical data by indicating the number of data points that lie within a range of values.
box plot
Reason — A box plot provides a visual representation of the statistical five-number summary of a given dataset. It includes the highest and lowest numbers, the median, and the upper and lower quartiles.
line
Reason — A line chart or line graph is a type of chart which displays information as a series of data points called "markers" connected by straight line segments.
linestyle
Reason — When creating scatter charts using Matplotlib's plot() function, the linestyle argument is skipped because scatter plots do not use line styles.
width
Reason — The width argument with a float value is used to change the width of bars in a bar chart created using the bar() function in Matplotlib.
histtype
Reason — The histtype argument in Matplotlib's hist() function is used to create a stacked bar type histogram. Setting histtype = 'barstacked' creates a histogram where bars for each bin are stacked on top of each other, representing different categories or subgroups within the data.
pie()
Reason — The pie() function in Matplotlib's Pyplot module can plot only one data series. On the other hand, functions like plot(), bar(), and boxplot() can plot multiple data series in a single chart.
patch_artist
Reason — The patch_artist argument in the boxplot() function is used to create a filled box plot. When set to True, it fills the boxes of the box plot with a color, making them more visually distinct.
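A brief sketch of a filled box plot follows. The data is the weights list used later in this chapter, and the fill color is an arbitrary choice for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

weights = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]

fig, ax = plt.subplots()

# With patch_artist=True the boxes are drawn as fillable patches
result = ax.boxplot(weights, patch_artist=True)
for box in result['boxes']:
    box.set_facecolor('lightblue')  # arbitrary fill color

box_type = type(result['boxes'][0]).__name__
```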
False
Reason — It is possible to plot multiple series of values in the same bar graph using Matplotlib's Pyplot library because the bar() and barh() functions support handling multiple datasets.
True
Reason — When a marker style string is given in the plot() function but the linestyle argument is skipped, the points are plotted without connecting lines, so the resulting chart resembles a scatter chart.
False
Reason — The plot appearance can be affected by the data series being plotted, but it can also be influenced by functions such as xlim(), which determine the range of values shown on the x-axis.
What is not true about Data Visualization ?
(a) Graphical representation of information and data.
(b) Helps users in analyzing a large amount of data in a simpler way.
(c) Data Visualization makes complex data more accessible, understandable, and usable.
(d) No library needs to be imported to create charts in Python language.
No library needs to be imported to create charts in Python language.
Reason — To create charts and visualizations in Python, we need to import libraries such as Matplotlib.
(a) line chart: plot() function
(b) bar chart: bar() function
(c) horizontal bar chart: barh() function
(d) histogram: hist() function
(e) scatter chart: scatter() function
(f) boxplot: boxplot() function
(g) pie chart: pie() function
The scatter chart is a graph of plotted points on two axes that show the relationship between two sets of data. With a scatter plot, a mark or marker (usually a dot or small circle), represents a single data point. With one mark (point) for every data point a visual distribution of the data can be seen. Depending on how tightly the points cluster together, we may be able to discern a clear trend in the data.
The difference is that in a scatter plot the individual points are not connected directly with a line; instead, their overall pattern expresses a trend.
A bar graph / bar chart is a graphical display of data using bars of different heights.
Compared to a line chart, which connects data points with lines, a bar chart is useful for comparing discrete categories rather than showing continuous trends over time. Bar charts are effective for highlighting differences in values between categories and are particularly useful when dealing with categorical data or comparing data across different groups or time periods.
A histogram is a summarization tool for discrete or continuous data. A histogram provides a visual interpretation of numerical data by indicating the number of data points that lie within a range of values (called "bins"). Histograms are a great way to show results of continuous data, such as: weight, height, how much time, and so forth.
A boxplot is a graphical representation of the distribution of a dataset through five summary statistics: the extremes (the highest and the lowest numbers), the median, and the upper and lower quartiles.
Box plots are suitable for visualizing the spread of data, identifying outliers, comparing data distribution between different groups or categories, and assessing symmetry in a dataset.
A frequency polygon is a type of frequency distribution graph. In a frequency polygon, the number of observations is marked with a single point at the midpoint of an interval. A straight line then connects each set of points. Frequency polygons make it easy to compare two or more distributions on the same set of axes.
Patterns, trends and correlations that might go undetected in text-based data can be exposed and recognized easier with data visualization techniques or tools such as line chart, bar chart, pie chart, histogram, scatter chart etc. Thus with data visualization tools, information can be processed in efficient manner and hence better decisions can be made.
Python supports data visualization by providing some useful libraries. The most commonly used data visualization library is matplotlib, a Python library that is also sometimes known as the plotting library. The matplotlib library offers a very extensive range of 2D plot types and output formats. It offers complete 2D support along with limited 3D graphics support. It is useful for producing publication-quality figures in interactive environments across platforms, and it can also be used for animations. There are many other Python libraries that can be used for data visualization, but matplotlib is very popular for 2D plotting.
For data visualization in Python, the matplotlib library's Pyplot interface is used. Matplotlib is a Python library that provides interfaces and functionalities for 2D graphics, similar to MATLAB's in various forms. It offers both a quick way to visualize data in Python and creates publication-quality figures in many formats. The Matplotlib library offers various named collections of methods. Pyplot, as one such interface, enables users to construct 2D plots easily and interactively.
bar() function | barh() function |
---|---|
This function is used to create vertical bar charts. | This function is used to create horizontal bar charts. |
In a vertical bar chart, the bars are plotted along the vertical axis (y-axis) with their lengths representing the values being plotted. | In a horizontal bar chart, the bars are plotted along the horizontal axis (x-axis) with their lengths representing the values being plotted. |
The first sequence given in the bar() forms the x-axis and the second sequence values are plotted on y-axis. | The first sequence given in the barh() forms the y-axis and the second sequence values are plotted on x-axis. |
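The contrast in the table can be seen in a short sketch. The hobby data is borrowed from a later exercise in this chapter; the side-by-side layout is an assumption made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

hobbies = ['Dance', 'Music', 'Painting', 'Playing Sports']
people_count = [300, 400, 100, 500]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# bar(): first sequence on the x-axis, values become bar heights
vertical_bars = ax1.bar(hobbies, people_count)
ax1.set_title('bar()')

# barh(): first sequence on the y-axis, values become bar widths
horizontal_bars = ax2.barh(hobbies, people_count)
ax2.set_title('barh()')
```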
In a chart/graph, there may be multiple datasets plotted. To distinguish among various datasets plotted in the same chart, legends are used. Legends can be different colors/patterns assigned to different specific datasets. The legends are shown in a corner of a chart/graph.
Calling the legend() function when none of the plotted series has a label leaves the legend with nothing meaningful to show: current Matplotlib versions issue a warning and draw no legend entries. Viewers then cannot tell which data series is which.
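A minimal sketch of a labelled legend is given here for comparison. The stock data comes from a later exercise in this chapter; the loc value is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

months = ['January', 'February', 'March']
prices_stock_A = [100, 120, 110]
prices_stock_B = [90, 110, 100]

fig, ax = plt.subplots()

# The label given to each series is what legend() displays
ax.plot(months, prices_stock_A, label='Stock A')
ax.plot(months, prices_stock_B, label='Stock B')
legend = ax.legend(loc='upper left')

legend_texts = [text.get_text() for text in legend.get_texts()]
```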
The xlimit and ylimit determine which data values are visible on the x-axis and y-axis in a plot or chart respectively. Only the data values that fall within these limits will be plotted. If no data value maps to the specified x-limits or y-limits, nothing will show on the plot for that particular axis range.
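The effect of axis limits can be sketched as follows. The data and the chosen limits are hypothetical; the example uses the Axes methods set_xlim() and get_xlim(), which correspond to Pyplot's xlim().

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

# Hypothetical data: x runs from 0 to 9
x = np.arange(10)
y = x ** 2

fig, ax = plt.subplots()
ax.plot(x, y, marker='o')

# Only the part of the data with x between 2 and 6 remains visible
ax.set_xlim(2, 6)
visible_range = tuple(ax.get_xlim())
```

The data outside the limits is still part of the plot; it is simply outside the visible range of the axis.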
(i) Line Chart — Use a line chart to show trends or changes over time. It's suitable for displaying continuous data series and highlighting patterns or fluctuations.
(ii) Bar Chart — Use a bar chart to compare categories or groups. It's effective for displaying discrete data and showing differences or relationships between items.
(iii) Scatter Chart — Use a scatter chart to visualize relationships between two variables. It's helpful for identifying correlations or trends in data points.
(iv) Pie Chart — Use a pie chart to represent parts of a whole. It's useful for showing the proportion or distribution of different categories within a dataset.
(v) Boxplot — The box plot is used to show the range and the middle half of ranked data while identifying outliers or variability.
A line chart is the suitable choice for visualizing how the temperature changed over the last seven days. The line chart shows trends over time and displays continuous data, making it ideal for representing temperature values. The chart's ability to connect data points allows viewers to easily observe temperature trends and understand variations across the seven-day period.
A histogram is a summarization tool for discrete or continuous data, providing a visual interpretation of numerical data by showing the number of data points that fall within a specified range of values.
The hist() function of the Pyplot module is used to create and plot a histogram from a given sequence of numbers. The syntax for using the hist() function in Pyplot is as follows:
matplotlib.pyplot.hist(x, bins = None, cumulative = False, histtype = 'bar', align = 'mid', orientation = 'vertical')
The hist() function in Matplotlib's Pyplot module allows creating various types of histograms. These include the default bar histogram (histtype='bar'), step histogram (histtype='step'), stepfilled histogram (histtype='stepfilled'), and barstacked histogram (histtype='barstacked').
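The four histtype values listed above can be compared in one figure. This is only a sketch: the weights list is reused from a later exercise, and the bin count of 5 is an arbitrary assumption.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

weights = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]
hist_types = ['bar', 'step', 'stepfilled', 'barstacked']

fig, axes = plt.subplots(1, 4, figsize=(12, 3))
totals = []
for ax, kind in zip(axes, hist_types):
    # Same data and bins each time; only the drawing style changes
    counts, edges, _ = ax.hist(weights, bins=5, histtype=kind)
    ax.set_title(kind)
    totals.append(int(counts.sum()))
```

The counts are the same for every histtype, since the argument affects only how the bins are drawn.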
Histograms are great for displaying specific ranges of values and are ideal for visualizing the results of continuous data, such as the ages of students in a class. Bar charts, on the other hand, are effective for comparing categorical or discrete data across different categories or groups, such as comparing the sales performance of different products.
A cumulative histogram is a graphical representation in which each bin displays the count of data points within that bin as well as the counts of all smaller bins. The final bin in this histogram indicates the total number of data points in the dataset.
In Matplotlib's hist() function, we can create a cumulative histogram by setting the cumulative parameter to True. The syntax is as follows: matplotlib.pyplot.hist(x, bins = None, histtype='barstacked', cumulative=True).
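A short sketch shows the running totals that a cumulative histogram produces. The weights data is reused from a later exercise, and the bin count is an arbitrary assumption.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no window is needed
import matplotlib.pyplot as plt

weights = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]

fig, ax = plt.subplots()

# Each bin holds its own count plus the counts of all smaller bins
counts, edges, _ = ax.hist(weights, bins=5, cumulative=True)

never_decreases = bool(np.all(np.diff(counts) >= 0))
final_bin = int(counts[-1])
```

The final bin equals the number of values in the dataset, as described above.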
A frequency polygon is a type of frequency distribution graph. In a frequency polygon, the number of observations is marked with a single point at the midpoint of an interval, and a straight line then connects each set of points.
We can create frequency polygon in following two ways:
The five-point summary is a descriptive statistics tool that provides a concise summary of the distribution of a dataset. It consists of five important numbers of a data range: the lowest number, the lower quartile, the median, the upper quartile, and the highest number.
A boxplot is a visual representation of the statistical five number summary of a given data set, including the extremes (the highest and the lowest numbers), the median, the upper and lower quartiles.
With Pyplot, a boxplot is created using the boxplot() function. The syntax is as follows: matplotlib.pyplot.boxplot(x, notch = None, vert = None, meanline = None, showmeans = None, showbox = None).
Executing the provided code will not produce an error. It will generate a plot of the logarithm of A against A itself.
The line A = np.arange(2, 20, 2) creates an array A using NumPy's arange() function. It starts from 2, increments by 2, and stops before 20, because the stop value is excluded. This results in the array [2, 4, 6, 8, 10, 12, 14, 16, 18]. Next, the line B = np.log(A) calculates the natural logarithm of each element in array A using NumPy's log() function and stores the results in array B. Finally, plt.plot(A, B) plots the values in array A along the x-axis and the corresponding values in array B along the y-axis using Matplotlib's plot() function.
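The arrays described above can be inspected directly. This is a small sketch using only NumPy; the helper variable names are hypothetical.

```python
import numpy as np

# arange() excludes the stop value, so 20 is not part of A
A = np.arange(2, 20, 2)
B = np.log(A)  # natural logarithm of each element

values_in_A = A.tolist()
stop_is_included = 20 in values_in_A
print(values_in_A)
print(B)
```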
Executing the provided code will not produce an error. However, the resulting plot might not be as expected because the x-axis values are discrete and categorical, not continuous.
The line A = np.arange(2, 20, 2) creates an array A using NumPy's arange() function. It starts from 2, increments by 2, and stops before 20, because the stop value is excluded. This results in the array [2, 4, 6, 8, 10, 12, 14, 16, 18]. Next, the line B = np.log(A) calculates the natural logarithm of each element in array A using NumPy's log() function and stores the results in array B. Finally, plt.bar(A, B) creates a bar plot using Matplotlib's bar() function. It plots the values in array A along the x-axis and the corresponding values in array B along the y-axis.
The code will produce an error because the variable Y is not defined.
The corrected code is:
import numpy as np
import matplotlib.pyplot as plt
X = np.arange(1, 18, 2.655)
B = np.log(X)
plt.scatter(X, B)
The line X = np.arange(1, 18, 2.655) creates an array X using NumPy's arange() function. It starts from 1, increments by 2.655, and generates values less than 18. The resulting array will look like [1., 3.655, 6.31, 8.965, 11.62, 14.275, 16.93]. Next, the line B = np.log(X) calculates the natural logarithm of each element in array X using NumPy's log() function. Finally, the original line plt.scatter(X, Y) attempts to use Matplotlib's scatter() function to create a scatter plot. However, Y is not defined in the code, leading to a NameError. Replacing Y with B fixes the error.
This code snippet uses Matplotlib to create a bar chart. The list Months contains the names of the months ['Dec', 'Jan', 'Feb', 'Mar'], while the list Attendance holds the corresponding attendance values [70, 90, 75, 95]. The plt.bar() function is then used to create a bar plot, where each bar represents a month and its height corresponds to the attendance value. Finally, plt.show() is called to display the plot.
import numpy as np
import matplotlib.pyplot as plt
A = np.arange(2, 20, 2)
B = np.log(A)
C = np.log(A) * 1.2
plt.plot(A, B)
plt.plot(A, C)
plt.show()
import numpy as np
import matplotlib.pyplot as plt
A = np.arange(2, 20, 2)
B = np.log(A)
C = np.log(A) * (-1.2)
plt.plot(A, B)
plt.plot(A, C)
plt.show()
import matplotlib.pyplot as plt
hobbies = ['Dance', 'Music', 'Painting', 'Playing Sports']
people_count = [300, 400, 100, 500]
plt.bar(hobbies, people_count)
plt.xlabel('Hobbies')
plt.ylabel('Number of People')
plt.title('Favourite Hobby')
plt.savefig('favourite_hobby_chart.png')
plt.show()
import matplotlib.pyplot as plt
import numpy as np
Matches = ['Match 1', 'Match 2', 'Match 3', 'Match 4', 'Match 5']
Team_A = [150, 160, 170, 180, 190]
Team_B = [140, 150, 160, 170, 180]
Team_C = [130, 140, 150, 160, 170]
Team_D = [120, 130, 140, 150, 160]
X = np.arange(len(Matches))
plt.bar(X, Team_A, width = 0.15, label = 'Team A')
plt.bar(X + 0.15, Team_B, width = 0.15, label = 'Team B')
plt.bar(X + 0.30, Team_C, width = 0.15, label = 'Team C')
plt.bar(X + 0.45, Team_D, width = 0.15, label = 'Team D')
plt.xticks(X + 0.225, Matches)
plt.xlabel('Matches')
plt.ylabel('Scores')
plt.title('IPL Scores')
plt.legend()
plt.show()
import matplotlib.pyplot as plt
months = ['January', 'February', 'March']
prices_stock_A = [100, 120, 110]
prices_stock_B = [90, 110, 100]
prices_stock_C = [95, 115, 105]
plt.plot(months, prices_stock_A, label='Stock A', marker='o')
plt.plot(months, prices_stock_B, label='Stock B', marker='s')
plt.plot(months, prices_stock_C, label='Stock C', marker='^')
plt.xlabel('Months')
plt.ylabel('Prices')
plt.title('Stock Prices Variation')
plt.legend()
plt.grid(True)
plt.show()
import numpy as np
import matplotlib.pyplot as plt
weights = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]
random_array = np.arange(16)
plt.hist(weights, orientation = 'horizontal')
plt.hist(random_array, orientation = 'horizontal')
plt.title('Horizontal Histograms')
plt.show()
Out of the above plotted histograms, none can be used for creating frequency polygons. We cannot draw frequency polygons from all the above histograms because to construct a frequency polygon, we need a step-type histogram. A frequency polygon is constructed by connecting the midpoints of the tops of the bars of a histogram. Step-type histograms provide a clear outline to draw these connections.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
weights = [78, 72, 69, 81, 63, 67, 65, 75, 79, 74, 71, 83, 71, 79, 80, 69]
plt.figure(figsize = (10, 5))
n, edges, p = plt.hist(weights, bins = 40, histtype = 'step')
m = 0.5 * (edges[1:] + edges[:-1])
m = m.tolist()
l = len(m)
m.insert(0, m[0] - 10)
m.append(m[l-1] + 10)
n = n.tolist()
n.insert(0, 0)
n.append(0)
plt.plot(m, n, '-^')
plt.show()