Previously, we looked at a version of linear regression with only one feature. Today, we're going to explore linear regression with multiple features.
Size in sqft (X1) | Number of bedrooms (X2) | Number of floors (X3) | Age of home in years (X4) | Price in $1000s |
2104 | 5 | 1 | 45 | 460 |
1416 | 3 | 2 | 40 | 232 |
1534 | 3 | 2 | 30 | 315 |
852 | 2 | 1 | 36 | 178 |
A few notations:
X1, X2, X3, X4 denote the 4 input features
Xj denotes the j-th feature (j = 1...4)
We'll also use n to denote the number of features (here n = 4)
X_i = features of i-th example
X_2 = [1416, 3, 2, 40] # a row vector
Xj_i = value of feature j in the i-th training example, e.g. X3_2 = 2
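As a rough sketch, the table above can be stored in a NumPy array with one row per training example and one column per feature. The names X_train and y_train are assumptions made for this illustration. Note that NumPy counts from 0, while the notation above counts from 1.

```python
import numpy as np

# Feature matrix: one row per training example, one column per feature
# (size in sqft, bedrooms, floors, age in years)
X_train = np.array([
    [2104, 5, 1, 45],
    [1416, 3, 2, 40],
    [1534, 3, 2, 30],
    [852, 2, 1, 36],
])
# Target prices in $1000s
y_train = np.array([460, 232, 315, 178])

# X_2, the features of the 2nd example, is row index 1 (0-based)
x_2 = X_train[1]
print(x_2)

# X3_2, feature 3 of the 2nd example, is row index 1, column index 2
x3_2 = X_train[1, 2]
print(x3_2)            # 2

print(X_train.shape)   # (4, 4): 4 examples, 4 features
```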
Now that we have multiple features, we define our model function differently:
f_w,b(X) = w1*X1 + w2*X2 + w3*X3 + w4*X4 + b
We can define W as a list of numbers that holds the parameters:
W = [w1, w2, w3, w4]
In math, this is called a vector. To show that a symbol is a vector, meaning a list of numbers, we sometimes draw an arrow on top of it.
So we can rewrite our model function as follows. Note that b doesn't have an arrow on top of it, because the bias is a single number rather than a vector:
f_W,b(X) = W · X + b
where the dot product W · X is calculated as w1*X1 + w2*X2 + w3*X3 + w4*X4, as in the first form above.
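To make this concrete, here is a small sketch that evaluates the model for the first training example, once term by term and once as a dot product. The parameter values w and b are made up for illustration only; they are not fitted values.

```python
import numpy as np

x = np.array([2104, 5, 1, 45])          # first training example
w = np.array([0.1, 4.0, 10.0, -2.0])    # hypothetical parameters
b = 80.0                                # hypothetical bias

# Expanded form: w1*x1 + w2*x2 + w3*x3 + w4*x4 + b
f_expanded = w[0]*x[0] + w[1]*x[1] + w[2]*x[2] + w[3]*x[3] + b

# Vector form: dot product of w and x, plus b
f_vector = np.dot(w, x) + b

# Both forms give the same prediction (up to floating-point rounding)
print(f_expanded, f_vector)
```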
Vectorization
When implementing a learning algorithm, using vectorization will both make our code shorter and make it run much more efficiently.
With vectorization, we can easily implement models with many input features using NumPy's dot function:
f = np.dot(w, x) + b
The NumPy dot function is a vectorized implementation of the dot product of two vectors. It is much faster than an explicit loop because NumPy is able to use the parallel hardware in your computer.
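One way to see the difference is to time an explicit Python loop against np.dot on large vectors. This is only a sketch: the helper name loop_dot, the vector length, and the random seed are choices made for this example, and the measured times will vary from machine to machine.

```python
import time
import numpy as np

def loop_dot(w, x):
    """Dot product computed one element at a time (no vectorization)."""
    total = 0.0
    for j in range(w.shape[0]):
        total += w[j] * x[j]
    return total

rng = np.random.default_rng(1)
w = rng.random(200_000)
x = rng.random(200_000)

start = time.perf_counter()
f_loop = loop_dot(w, x)
loop_ms = 1000 * (time.perf_counter() - start)

start = time.perf_counter()
f_vec = np.dot(w, x)
vec_ms = 1000 * (time.perf_counter() - start)

print(f"loop:   {f_loop:.4f} in {loop_ms:.2f} ms")
print(f"np.dot: {f_vec:.4f} in {vec_ms:.2f} ms")
```

Both versions compute the same value; only the time taken differs.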
[Image: comparison of the implementations with and without vectorization]
Matrices
Matrices are 2-dimensional arrays. The elements of a matrix are all of the same type.
NumPy's basic data structure is an indexable, n-dimensional array containing elements of the same type (dtype). Matrices have a 2-dimensional index [m, n].
Matrix creation
The same functions that created 1-D vectors will create 2-D arrays.
Below, the shape tuple is provided to achieve a 2-D result. Notice how NumPy uses brackets to denote each dimension. Notice further that NumPy, when printing, will print one row per line.
import numpy as np
a = np.zeros((1, 5))
print(f"a shape = {a.shape}, a = {a}")
# result:
a shape = (1, 5), a = [[0. 0. 0. 0. 0.]]
a = np.zeros((2, 1))
print(f"a shape = {a.shape}, a = {a}")
# result:
a shape = (2, 1), a = [[0.]
 [0.]]
a = np.random.random_sample((1, 1))
print(f"a shape = {a.shape}, a = {a}")
# result:
a shape = (1, 1), a = [[0.44236513]]
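Values can also be specified manually. A short sketch using np.array with a nested list, where each inner list becomes one row (the values and the name m are arbitrary choices for this example):

```python
import numpy as np

# Each inner list is one row of the matrix
a = np.array([[5], [4], [3]])
print(f"a shape = {a.shape}, a = \n{a}")

m = np.array([[1, 2, 3],
              [4, 5, 6]])
print(f"m shape = {m.shape}, m = \n{m}")
```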
Indexing
Matrices include a second index. The two indexes describe [row, column].
Access can return either a single element or a whole row or column. See below:
# vector indexing operations on matrices
a = np.arange(6).reshape(-1, 2)
print(f"a shape: {a.shape}, \na = {a}")
# result:
a shape: (3, 2),
a = [[0 1]
 [2 3]
 [4 5]]
# access an element
print(f"\na[2,0].shape: {a[2,0].shape}, a[2,0] = {a[2,0]}, "
      f"type(a[2,0]) = {type(a[2,0])} \naccessing an element returns a scalar")
# result:
a[2,0].shape: (), a[2,0] = 4, type(a[2,0]) = <class 'numpy.int64'>
accessing an element returns a scalar
# access a row
print(f"a[2].shape: {a[2].shape}, a[2] = {a[2]}, type(a[2]) = {type(a[2])}")
# result:
a[2].shape: (2,), a[2] = [4 5], type(a[2]) = <class 'numpy.ndarray'>
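The example above returns a row. To get a column, put a colon in the row position so that every row is selected. A minimal sketch using the same array (the name col is just for this example):

```python
import numpy as np

a = np.arange(6).reshape(-1, 2)   # rows: [0 1], [2 3], [4 5]

# access a column: all rows, column index 1
col = a[:, 1]
print(f"a[:, 1].shape: {col.shape}, a[:, 1] = {col}")
# a[:, 1].shape: (3,), a[:, 1] = [1 3 5]
```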
Slicing
Slicing creates an array of indices using a set of 3 values (start:stop:step). A subset of values is also valid. Its use is best explained by an example:
# vector 2-D slicing operations
a = np.arange(20).reshape(-1, 10)
print(f"a = \n {a}")
# result:
a =
 [[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]]
# access 5 consecutive elements (start:stop:step)
print("a[0, 2:7:1] = ", a[0, 2:7:1], ", a[0, 2:7:1].shape =", a[0, 2:7:1].shape, "a 1-D array")
# result:
a[0, 2:7:1] = [2 3 4 5 6], a[0, 2:7:1].shape = (5,) a 1-D array
# access 5 consecutive elements in 2 rows (start:stop:step)
print("a[:, 2:7:1] = ", a[:, 2:7:1], ", a[:, 2:7:1].shape =", a[:, 2:7:1].shape, "a 2-D array")
# result:
a[:, 2:7:1] = [[ 2  3  4  5  6]
 [12 13 14 15 16]], a[:, 2:7:1].shape = (2, 5) a 2-D array
# access all elements
print("a[:,:] = \n", a[:,:], ", a[:,:].shape =", a[:,:].shape)
# result:
a[:,:] =
 [[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]], a[:,:].shape = (2, 10)
# access all elements in one row; a[1,:] is the same as a[1]
print("a[1,:] = ", a[1,:], ", a[1,:].shape =", a[1,:].shape, "a 1-D array")
# result:
a[1,:] = [10 11 12 13 14 15 16 17 18 19], a[1,:].shape = (10,) a 1-D array
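The examples above all use a step of 1. As a further sketch on the same array, a different step, or omitted start and stop values, work the same way:

```python
import numpy as np

a = np.arange(20).reshape(-1, 10)

# every second element of row 0 (start and stop omitted)
print(a[0, ::2])        # [0 2 4 6 8]

# columns 3 onward in both rows (stop and step omitted)
print(a[:, 3:].shape)   # (2, 7)

# first three columns of row 1 (start and step omitted)
print(a[1, :3])         # [10 11 12]
```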