
Day 1: Cost Function (examples)


Yesterday's recap: we covered a brief introduction to machine learning, linear regression, the model function, and the cost function.

We have seen the mathematical definition of the cost function. Let's walk through an example to see how the cost function can be used to find the best parameters for your model.

Once we have our cost function J, the goal of linear regression is to find the values of w and b that make J(w,b) as small as possible. For this example we fix b at 100 and try different values of w.

import numpy as np

x_train = np.array([1.0, 2.0])        # size in 1000 sqft
y_train = np.array([300.0, 500.0])     # price in 1000s of dollars
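The hand calculations in the examples can be reproduced with a small helper. This is a minimal sketch, assuming the squared-error cost with the 1/(2m) factor used in this post; the function name `compute_cost` is chosen here for illustration and does not come from the original text.

```python
import numpy as np

x_train = np.array([1.0, 2.0])       # size in 1000 sqft
y_train = np.array([300.0, 500.0])   # price in 1000s of dollars


def compute_cost(x, y, w, b):
    """Squared-error cost J(w, b) = sum((f(x_i) - y_i)^2) / (2 * m)."""
    m = x.shape[0]                   # number of training examples
    predictions = w * x + b          # f(x_i) for every example at once
    squared_errors = (predictions - y) ** 2
    return squared_errors.sum() / (2 * m)


# Evaluate the cost for one candidate w, with b fixed at 100
cost_example = compute_cost(x_train, y_train, w=150, b=100)
print(cost_example)
```

Calling `compute_cost` with a different `w` gives the cost for any other candidate value, so each worked example can be checked the same way.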

Example 1: w = 150

Left Graph:

f(x_i) = w * x_i + b

f(1.0) = (150 * 1.0) + 100 = 250

f(2.0) = (150 * 2.0) + 100 = 400

Right Graph:

J_1 = (250 - 300) ** 2 = 2,500

J_2 = (400 - 500) ** 2 = 10,000

final_J = (J_1 + J_2) / (2 * m) = (2,500 + 10,000) / (2 * 2) = 3,125, where m = 2 is the number of training examples

Both graphs show that w = 150 is not the right value. On the left graph, the predictions (250 and 400) fall well below the actual values (300 and 500).

On the right graph, we want the cost J to be at its minimum, but with w = 150 it comes out to 3,125.
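To see where the 3,125 comes from, the per-example breakdown for Example 1 can be printed step by step. This is a rough sketch using only the numbers from this example; the variable names are illustrative.

```python
import numpy as np

x_train = np.array([1.0, 2.0])       # size in 1000 sqft
y_train = np.array([300.0, 500.0])   # price in 1000s of dollars
w, b = 150.0, 100.0                  # Example 1 parameters

predictions = w * x_train + b                    # f(x_i) = w * x_i + b
squared_errors = (predictions - y_train) ** 2    # J_1, J_2 before averaging
cost = squared_errors.sum() / (2 * len(x_train))

# Print one row per training example, then the final cost
for x_i, y_i, f_i, j_i in zip(x_train, y_train, predictions, squared_errors):
    print(f"x={x_i}: predicted {f_i}, actual {y_i}, squared error {j_i}")
print(f"J = {cost}")
```

The second example contributes four times as much squared error as the first, because its prediction is off by 100 rather than 50 and the error is squared.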

Example 2: w = 200

Left Graph:

f(x_i) = w * x_i + b

f(1.0) = (200 * 1.0) + 100 = 300

f(2.0) = (200 * 2.0) + 100 = 500

Right Graph:

J_1 = (300 - 300) ** 2 = 0

J_2 = (500 - 500) ** 2 = 0

final_J = (J_1 + J_2) / (2 * m) = (0 + 0) / (2 * 2) = 0

In example 2, we tried w = 200, and this fits our data exactly.

On the left graph, the prediction line passes right through the actual values.

On the right graph, the cost is J = 0. That is the minimum, because a sum of squared errors can never be negative, and it matches what we see on the left: the predictions line up with the target values.

In conclusion, the closer the cost is to its minimum, the more closely our predictions match the training data, so the goal is to choose the w and b that make the cost function as small as possible.
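The same idea can be checked for more than two values of w by sweeping over a range of candidates and picking the one with the lowest cost. This is only a sketch: the candidate range (0 to 400 in steps of 50) is an assumption made here for illustration, and b stays fixed at 100 as in the examples.

```python
import numpy as np

x_train = np.array([1.0, 2.0])       # size in 1000 sqft
y_train = np.array([300.0, 500.0])   # price in 1000s of dollars
b = 100.0                            # fixed, as in the examples

# Assumed candidate values for w: 0, 50, ..., 400
w_candidates = np.arange(0.0, 401.0, 50.0)

# Squared-error cost J(w, b) for each candidate w
costs = np.array([
    ((w * x_train + b - y_train) ** 2).sum() / (2 * len(x_train))
    for w in w_candidates
])

best_w = w_candidates[np.argmin(costs)]   # candidate with the smallest cost

for w, cost in zip(w_candidates, costs):
    print(f"w = {w:5.0f}  J = {cost:9.1f}")
print("best w:", best_w)
```

The costs fall as w approaches 200 and rise again on the other side, which is the bowl shape the right-hand graph traces out.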
