Python Data Foundations Documentation

A plain documentation-style guide for Python, data handling, visualization, and machine learning basics.

Generated Data and Coordinate Plots

This page shows how to generate random integer and decimal data, create coordinate pairs, plot points, and select dimensions from generated matrices.

What you should be able to do
  • Generate integers with np.random.randint.
  • Generate decimal values with np.random.uniform and np.random.rand.
  • Use list coordinates and matrix columns in scatter plots.
  • Visualize chosen dimensions from a multi-feature array.

Reusable patterns
  • np.random.randint(low, high, size) includes low but excludes high.
  • For plt.scatter(x, y), x and y must have the same length.
  • data[:, column] selects one dimension across every row.
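
The three patterns above can be checked directly. The following is a minimal sketch, not part of the original page: the seed is an assumption added only so repeated runs give the same numbers, and the names sample, points, first_column, and second_column are illustrative.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed so repeated runs give the same numbers

# randint(low, high, size): low is included, high is excluded
sample = np.random.randint(10, 20, size=1000)
print(sample.min() >= 10, sample.max() <= 19)  # True True

# data[:, column] selects one dimension across every row
points = np.random.randint(1, 100, size=(15, 2))
first_column = points[:, 0]
second_column = points[:, 1]

# both columns hold one value per row, so they are valid x and y for scatter(x, y)
print(first_column.shape, second_column.shape)  # (15,) (15,)
```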

Manually defining and generating data

Listing 1. Generate data

# generate data
import numpy as np
a = np.random.randint(10, 20, size = 10) # 10 integers from 10 to 19: low (10) is included, high (20) is excluded, and size is the number of elements
a
Expected text output or note
array([15, 18, 14, 14, 10, 18, 14, 14, 17, 19])

Listing 2. Pass the size positionally instead of as a keyword

# the size= keyword can be omitted; the third positional argument is the size:
a = np.random.randint(10, 20, 4)
a
Expected text output or note
array([19, 16, 15, 10])

Listing 3. Generate decimal numbers

# generate decimal numbers
b = np.random.uniform(40, 100, 10)
b
Expected text output or note
array([40.99693929, 89.69422186, 47.25588507, 82.4686031 , 46.8606344 ,
       54.43238276, 76.17707368, 86.98758579, 97.09708864, 65.99166661])
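
The range and type of the values from np.random.uniform can be inspected as follows. This is a small sketch under the assumption that rounding is wanted only for display; np.round is not used in the original page.

```python
import numpy as np

# uniform(low, high, size) draws floats from the interval [low, high)
b = np.random.uniform(40, 100, 10)

print(b.dtype)          # float64
print(np.round(b, 2))   # the same values rounded to two decimals for easier reading

# every value lies between the two bounds
print((b >= 40).all(), (b <= 100).all())  # True True
```

The array b itself keeps full precision; np.round returns a new array and does not modify b.
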

Practice task. Generate 20 random integers in the range from 5 to 20, with 20 included.

Listing 4. Generate 20 integers from 5 to 20

c = np.random.randint(5, 21, 20) # high is 21 because the upper bound is excluded, so 20 can still be drawn
c
Expected text output or note
array([18,  8, 19, 12, 16,  8, 17, 11,  7, 15, 13,  5, 17, 19,  7, 12, 17,
       16, 14, 18])
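
To see why the upper bound in this task is 21 rather than 20, it helps to compare both calls on a larger sample. This is an illustrative sketch: the sample size of 20,000 and the name c_short are assumptions chosen only to make the range visible, while the task itself asks for 20 values.

```python
import numpy as np

# upper bound 21: the values 5, 6, ..., 20 are all possible
c = np.random.randint(5, 21, 20_000)

# upper bound 20: the largest possible value is 19
c_short = np.random.randint(5, 20, 20_000)

print(c.min(), c.max())
print(c_short.min(), c_short.max())
```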

Listing 5. Define points in a coordinate system

# define points in a coordinate system
# each point has two coordinates, x and y, so the shape must have 2 columns
# the size= keyword can be omitted and the shape tuple passed positionally
d = np.random.randint(1, 100, size = (15, 2)) # generates 15 points with coordinates from 1 to 99
d
Expected text output or note
array([[55, 25],
       [58, 71],
       [10, 14],
       [69, 52],
       [40, 49],
       [77, 70],
       [49, 97],
       [77, 17],
       [39,  3],
       [77, 47],
       [18, 25],
       [21, 45],
       [42, 89],
       [ 5, 27],
       [57, 18]])
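
The shape of such an array and the way it splits into x and y coordinates can be sketched as follows. The names x_coords, y_coords, x_alt, and y_alt are illustrative and do not appear in the original page.

```python
import numpy as np

d = np.random.randint(1, 100, size=(15, 2))

print(d.shape)  # (15, 2): 15 rows (points) and 2 columns (x and y)
print(d[0])     # the first point as [x, y]

x_coords = d[:, 0]  # all x values
y_coords = d[:, 1]  # all y values

# d.T has shape (2, 15), so it unpacks into the same two columns
x_alt, y_alt = d.T
print(np.array_equal(x_coords, x_alt), np.array_equal(y_coords, y_alt))  # True True
```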

Listing 6. Generates 50 points in the range 1 to 499 and stores them in data

data = np.random.randint(1, 500, (50, 2)) # generates 50 points in the range 1 to 499 and stores them in data

Listing 7. Generate 20 points with random decimal coordinates between 0 and 1

# generate 20 points whose x and y coordinates are random decimals from 0 (included) to 1 (excluded)
e = np.random.rand(20, 2)
e
Expected text output or note
array([[0.72948888, 0.02337667],
       [0.64381173, 0.89750167],
       [0.60162636, 0.60744674],
       [0.88710748, 0.30296671],
       [0.13830156, 0.98583179],
       [0.62640795, 0.63346163],
       [0.87897929, 0.17759322],
       [0.85258553, 0.8628844 ],
       [0.17033283, 0.07986303],
       [0.83271872, 0.71307192],
       [0.90462015, 0.64996198],
       [0.33093289, 0.41700548],
       [0.50650215, 0.39042263],
       [0.88031627, 0.25150871],
       [0.65651668, 0.13294804],
       [0.00507865, 0.34921312],
       [0.14447559, 0.92991954],
       [0.30586889, 0.01845593],
       [0.39530518, 0.38419802],
       [0.64733883, 0.77887366]])
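
np.random.rand takes the dimensions as separate arguments, while np.random.uniform takes the shape as one tuple. The sketch below puts the two calls side by side and rescales the unit-interval values to another range; the names e_alt and scaled and the target range 40 to 100 are assumptions chosen for illustration.

```python
import numpy as np

e = np.random.rand(20, 2)                      # dimensions as separate arguments
e_alt = np.random.uniform(0, 1, size=(20, 2))  # same range, shape passed as a tuple

print(e.shape, e_alt.shape)  # (20, 2) (20, 2)

# rescale values from [0, 1) to the range 40 to 100 by stretching and shifting
scaled = 40 + (100 - 40) * e
print(scaled.min(), scaled.max())
```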

Setting coordinate values manually

Listing 8. Using Python lists

# using Python lists
x1 = [2, 3, 4]
y1 = [5, 5, 5]
# these are 3 points: (2, 5), (3, 5), and (4, 5)
y2 = [6, 8, 9] # if y2 is added with the same x1, this gives three more points: (2, 6), (3, 8), and (4, 9)
# for plt.scatter(x, y), the lengths of x and y must be equal
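
The pairing of list elements into points can be made explicit with zip. This is a minimal sketch; the names points_1 and points_2 are illustrative. If the lengths differ, plt.scatter raises a ValueError, so a quick length check before plotting is a reasonable habit.

```python
x1 = [2, 3, 4]
y1 = [5, 5, 5]
y2 = [6, 8, 9]

# zip pairs the i-th x value with the i-th y value
points_1 = list(zip(x1, y1))
points_2 = list(zip(x1, y2))
print(points_1)  # [(2, 5), (3, 5), (4, 5)]
print(points_2)  # [(2, 6), (3, 8), (4, 9)]

# check the lengths before plotting
print(len(x1) == len(y1) == len(y2))  # True
```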

Listing 9. Visual display

# visual display
import matplotlib.pyplot as plt
plt.scatter(x1, y1)
plt.title("Generated points with coordinates (x1, y1)")
plt.xlabel("Value of variable x1")
plt.ylabel("Value of variable y1")
plt.show()
Expected text output or note
<Figure size 640x480 with 1 Axes>

[visual output omitted; run the code to display the image or chart]

Listing 10. Plot the same x values against y2

plt.scatter(x1, y2)
plt.title("Generated points with coordinates (x1, y2)")
plt.xlabel("Value of variable x1")
plt.ylabel("Value of variable y2")
plt.show()
Expected text output or note
<Figure size 640x480 with 1 Axes>

[visual output omitted; run the code to display the image or chart]

Listing 11. Plot generated data of equal length

# data can be generated as before, but both arrays must have the same length:
a = np.random.randint(20, 40, 12)
b = np.random.uniform(60, 80, 12)
plt.scatter(a, b)
plt.title('Generated points with coordinates (a, b)')
plt.xlabel('Value of variable a')
plt.ylabel('Value of variable b')
plt.show()
Expected text output or note
<Figure size 640x480 with 1 Axes>

[visual output omitted; run the code to display the image or chart]

Listing 12. Plot manually defined points in red

x=[3.5,2,2,2,2.5,3,2.5,3,3.5,4,4.5,4,4.5,5,5.5,5,5.5,6,6.5,6,6.5,7,7.5,7,7.5,8,8,8]
y=[7,6,5.5,5,4.5,7,6.5,4,3.5,7,6.5,3,2.5,6,6.5,2,2.5,7,7,3,3.5,7,6.5,4,4.5,6,5.5,5]
plt.scatter(x,y, color = "r")
plt.show()
Expected text output or note
<Figure size 640x480 with 1 Axes>

[visual output omitted; run the code to display the image or chart]

Listing 13. Multiple data series in one plot

# multiple data series in one plot
g = np.random.randint(10, 40, 10)
r = np.random.randint(20, 30, 10)
a = np.random.randint(30, 40, 10)
p = np.random.randint(40, 50, 10)
plt.scatter(g, r, label = "g, r") # label is the legend text
plt.scatter(a, p, label = "a, p", color = "r", marker = "^")
plt.scatter(a, r, label = "a, r", color = "g", marker = "<")
plt.legend()
plt.title("Data display")
plt.show()
Expected text output or note
<Figure size 640x480 with 1 Axes>

[visual output omitted; run the code to display the image or chart]

Listing 14. Display all points in the data

# display all points in the data
data = np.random.randint(1, 500, (50, 2))
plt.scatter(data[:, 0], data[:, 1]) # data[:, 0] takes all values in the first column, the x coordinates; data[:, 1] takes the second column, the y coordinates
Expected text output or note
<matplotlib.collections.PathCollection at 0x7d8135ce03b0>

<Figure size 640x480 with 1 Axes>

[visual output omitted; run the code to display the image or chart]

Practice task. Store 50 points with 5 dimensions each in x5, using integer values from 20 to 250. Visualize the data so the x-axis uses the 2nd dimension and the y-axis uses the 4th dimension.

Listing 15. Generate 50 rows (examples or points) and 5 columns (dimensions or features)

x5 = np.random.randint(20, 250, (50, 5)) # 50 rows (examples or points) and 5 columns (dimensions or features); high = 250 is excluded, so values run from 20 to 249 (use 251 to include 250)
plt.scatter(x5[:, 1], x5[:, 3]) # column index 1 is the 2nd dimension and column index 3 is the 4th, because indices start at 0
Expected text output or note
<matplotlib.collections.PathCollection at 0x7d8135413680>

<Figure size 640x480 with 1 Axes>

[visual output omitted; run the code to display the image or chart]
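
The PathCollection line in the output above is the return value of plt.scatter. The sketch below captures it to confirm which columns were plotted and adds axis labels. The Agg backend, the names x_dim, y_dim, collection, and plotted, and the label texts are assumptions made for illustration; in a notebook the backend line can be dropped and plt.close() replaced with plt.show().

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x5 = np.random.randint(20, 250, (50, 5))

# dimension numbers in the task are 1-based; column indices are 0-based
x_dim, y_dim = 2, 4
x_values = x5[:, x_dim - 1]
y_values = x5[:, y_dim - 1]

collection = plt.scatter(x_values, y_values)
plt.xlabel(f"Dimension {x_dim}")
plt.ylabel(f"Dimension {y_dim}")
plt.title("Selected dimensions of x5")

# the returned collection stores the plotted (x, y) pairs
plotted = np.asarray(collection.get_offsets())
print(plotted.shape)  # (50, 2)

plt.close()  # release the figure; use plt.show() instead in an interactive session
```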
