The Infinitesimal Calculus

Scroll 1

Aranfan

Team Plasma Grunt
It's time to learn Calculus bitches! The Infinitesimal kind, as Euler and Leibniz intended!

I shall assume that everyone reading this has a solid grasp on algebra and knows a bit of geometry.

PART 1: COORDINATE GEOMETRY​


In Euclidean geometry you can draw a line between any two points, and extend said line indefinitely. You can draw a circle with its center at any point and its edge at any other.

So you pick two points and say the line between them is length 1. You extend this line indefinitely in both directions. Label one point (0,0) and the other (1,0). Use circles to mark off points at regular intervals of unit length. The point one unit length away from (0,0) on the opposite side from (1,0) is (-1,0). You now have directions. Call this line the X-Axis. A point on the X-axis can be uniquely specified by (x,0) where x is any real number.

By Euclid 1-11, it is possible to draw a line perpendicular to the X-Axis at (0,0). Do so to create the Y-Axis. Use circles to mark off points of unit length on it, and label points on the Y-Axis (0,y) where y is any real number. Choose (0,1) and (0,-1) for direction. It doesn't technically matter which side you pick, but convention is to say (0,1) is "above" the X-Axis.

By Euclid 1-12 it is possible to drop a perpendicular from any point not on a line to the line in question. By dropping a perpendicular from any point in the plane we can see where on the X and Y axes those perpendiculars will fall, and we can call that point (x₁,y₁), where x₁ and y₁ are the lengths from the origin to the perpendiculars dropped from the point to the axis in question. It is also obvious that if you make perpendiculars at (x₁,0) and (0,y₁) they will intersect at that same point. Thus we are able to uniquely identify any point with an ordered pair of numbers (x,y).

By the Pythagorean Theorem the distance s between any two points is:
LaTeX:
\[ s=+ \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \]
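As a quick sanity check, the distance formula is easy to try out numerically. This is just my own illustrative sketch (the function name `distance` is mine, not from the text):

```python
import math

def distance(p, q):
    """Distance between two points via the Pythagorean theorem."""
    (x1, y1), (x2, y2) = p, q
    return math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)

print(distance((0, 0), (3, 4)))  # the classic 3-4-5 right triangle → 5.0
```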
A little work will show that a line between any two points can be described by an equation:
LaTeX:
\[ (y - y_1)(x_1 - x_2)=(y_1 - y_2)(x-x_1) \]
And a circle with a center at a given point and a radius r is given by
LaTeX:
\[ (x-x_1)^2 + (y-y_1)^2 = r^2 \]

With this we can use algebra to do geometry. We can find lengths and intersections and tangents of circles and all the other things Euclid could do in two dimensions. Adding a Z-axis lets us do solid geometry, once we figure out the equations for a plane or a sphere. But we also now have the tools to go beyond Euclid, to describe curves impossible to create with only a compass and straightedge. The Conic Sections can be described with only equations of degree 2, and equations of degree 3 can describe curves never conceived of by the great Hellenistic Geometers of old.
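To see the line and circle equations in action, here's a small sketch of my own (the predicate names are mine): plugging candidate points into the two-point line equation and the circle equation tells you whether the point lies on the figure.

```python
def on_line(p, p1, p2):
    # two-point form: (y - y1)(x1 - x2) == (y1 - y2)(x - x1)
    x, y = p
    x1, y1 = p1
    x2, y2 = p2
    return (y - y1) * (x1 - x2) == (y1 - y2) * (x - x1)

def on_circle(p, center, r):
    # circle equation: (x - x1)^2 + (y - y1)^2 == r^2
    x, y = p
    x1, y1 = center
    return (x - x1) ** 2 + (y - y1) ** 2 == r ** 2

print(on_line((2, 4), (0, 0), (1, 2)))  # (2,4) lies on the line y = 2x → True
print(on_circle((3, 4), (0, 0), 5))     # (3,4) lies on the circle of radius 5 → True
```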

The obvious questions arise however... can we find the tangents and areas of these new figures?

PART 2: TANGENTS​

To draw a secant line on a curve is dead simple: you take two points on the curve, draw a line between them, and bam, you're done. Tangents are much, much harder, because with a tangent you only have one point: a tangent is a line that only touches a curve at one point. The equation for a line above collapses into 0=0 if both points are the same. You can't draw a line with only one point! Right?

Well. Not quite. The equation for a line I gave above is the most general form, but it's not the most common. The most common form of the equation of a line is
LaTeX:
\[ (y-y_1)=m(x-x_1) \\ m=\frac{y_1 - y_2}{x_1 - x_2} \]
The trick here is that for a line, m will be the same no matter what two points you pick on the line. The ratio of rise (change in y coordinate) over run (change in x coordinate) will always be the same no matter where on the line you are. So if we had some way to find m, independently of a second point... we'd be able to draw a line anyway. But we can't just choose a value for m willy nilly if we want to make a tangent. So we are once again stuck. We can draw a line with only one point if we have a slope, but the constraints on a tangent mean we need to find m, and the only way we know of spits out division by zero if we only give it one point.

So what if we didn't?

Have you ever noticed that the more sides a regular polygon has the more it looks like a circle? That the bigger the circle is, the more a small enough arc of it looks like a line? What if we zoomed in really far?

It is time to introduce the star of the infinitesimal calculus, the differential: df.

Let f be a function of a variable q, where every other variable is in some way related to q, but we don't know how. Then
LaTeX:
\[ df=f(q+h)-f(q) \\ f(q+h)=f(q)+df \\ df=(f(q)+df)-f(q) \]
Where the "d" stands for difference, and isn't a new variable but rather attaches to an existing one. You can't cancel out "d", but you can cancel out "dx" or "dy" or "du" or etc.

So what makes dx so special? Right now it just seems like a normal difference you could find using two points. The difference is that h is small, really small, infinitely small in fact. Which means that for any case where you are adding or subtracting h from an expression that doesn't involve h at all... it's negligible.

Let's use our new friend the differential to calculate the tangents to a simple curve like the parabola y=x^2. We see that rise over run is just a quotient of differences, so we want to find dy/dx:
LaTeX:
\[ y=x^2 \Rightarrow (y+dy)-y=(x+dx)^2-x^2 \\ dy=x^2+2xdx+(dx)^2-x^2 \\dy=2xdx+(dx)^2 \\ \frac{dy}{dx}=2x+dx \]
Now of course this isn't the slope of a tangent. This is a secant's... but note that the slopes of all these secants, whose points are only dx apart, differ from each other only by infinitesimals. The appreciable part, if you will, of the slope of all these secants is... 2x. All of them. It doesn't matter which infinitesimal we use for h or dx, positive or negative. They all have the same appreciable part. So we can just round off the infinitesimal.
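You can watch this "appreciable part" emerge numerically. Here's a small sketch of my own, using an ordinary finite h as a stand-in for the infinitesimal dx:

```python
def secant_slope(f, x, h):
    """Slope of the secant through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h

f = lambda x: x ** 2
for h in (0.1, 0.001, 1e-6):
    # for y = x^2 the secant slope is exactly 2x + h, so the
    # appreciable part 2*3 = 6 is what survives as h shrinks
    print(secant_slope(f, 3, h))
```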

The equation for a tangent to a parabola y=x^2 at a point is
LaTeX:
\[ (y-y_1)=2x_1(x-x_1) \]
It has to be this, because it can't be anything else. A formal proof for every tangent calculation of this sort, that it can't be more nor less, would be exhausting, but in principle possible. I'm not going to bother doing it. However, note that the condition that the appreciable part not depend on the infinitesimal increment is very important. If the appreciable part of the ratio dy/dx in any way depends on dx, then we can't round off the infinitesimals and we can't infer a unique tangent line from the secants. This matters in cases where there is a sharp turn in the curve, like the absolute value at 0, or where a curve intersects itself.

Note that we can derive some rules for the differential. Applying d to a variable that is complicated enough can be broken down into easier steps.

By the binomial theorem, for integers you can see that:
LaTeX:
\[ d(x^n)=nx^{n-1}dx \]
To show this rule works for all exponents needs the use of the logarithm, but it does, in fact, work for all real numbers n.

The constant rule is that da is 0 when a is a constant. dπ? 0. This is clear because if f(x)=5, then it doesn't matter what the increment is, it'll be 5-5=0. Likewise d distributes over addition and subtraction. The more interesting rules to derive are the product and quotient rules
LaTeX:
\[ d(uv)=(u+du)(v+dv)-uv \\ d(uv)=uv+vdu+udv+dudv-uv \\ d(uv)=vdu+udv+dudv \\ d(uv)\approx vdu+udv \]
and:
LaTeX:
\[ d(\frac{u}{v})=\frac{u+du}{v+dv}-\frac{u}{v} \\ d(\frac{u}{v})=\frac{(u+du)v-u(v+dv)}{v(v+dv)} \\ d(\frac{u}{v})=\frac{uv+vdu-uv-udv}{v^2+vdv} \\ d(\frac{u}{v})=\frac{vdu-udv}{v^2+vdv} \approx \frac{vdu-udv}{v^2} \]
Where the vdv is discarded as negligible compared to v^2, and dudv is likewise discarded as negligible compared to udv+vdu. The chain rule, on the other hand, is just multiplication by 1:
LaTeX:
\[ dy=dy \\ dy=dy \frac{du}{du} \\ dy=dy \frac{du}{du}\frac{dx}{dx} \\ dy=\frac{dy}{du}\frac{du}{dx}dx \\ \frac{dy}{dx}=\frac{dy}{du}\frac{du}{dx}\\\\ y=u^2, u=(2x+3) \\ dy=2udu=2u(2)dx=2(2x+3)(2)dx \]
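The worked chain rule example above (y=u^2 with u=2x+3) is easy to check numerically. A sketch of my own, where `deriv` is a finite-difference stand-in for the infinitesimal quotient dy/dx:

```python
def deriv(f, x, h=1e-7):
    # finite-difference approximation of dy/dx
    return (f(x + h) - f(x)) / h

y = lambda x: (2 * x + 3) ** 2
x = 1.0
chain = 2 * (2 * x + 3) * 2  # (dy/du)(du/dx) = 2u * 2 with u = 2x + 3
print(deriv(y, x), chain)  # both ≈ 20
```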

PART 3: MAXIMA AND MINIMA​

Of course, once you are able to find tangents you can use them for all sorts of things! The most consistently useful of these is finding the highest and lowest values of a given curve. You just find where the differential quotient is zero, and you have either a minimum or a maximum. The second derivative test can be used to find out which is which:
LaTeX:
\[ \frac{d(\frac{dy}{dx})}{dx}|_a<0 \Rightarrow y|_a=max \\\frac{d(\frac{dy}{dx})}{dx}|_a>0 \Rightarrow y|_a=min \]

Which makes sense, since the sign of the second derivative tells you whether the curve is concave up or concave down. If the second derivative is also zero, such as with y=x^3 at 0, then things have gone strange and the point in question may be neither a maximum nor a minimum.
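Here's a numerical sketch of the second derivative test (my own finite-difference approximations, not part of the original argument), using y = x^3 - 3x, whose differential quotient 3x^2 - 3 vanishes at x = ±1:

```python
def d(f, x, h=1e-6):
    # central-difference approximation of the first derivative
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(f, x, h=1e-4):
    # central-difference approximation of the second derivative
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

y = lambda x: x ** 3 - 3 * x
print(d(y, 1.0), d2(y, 1.0))    # ≈ 0 and ≈ 6 > 0: a minimum at x = 1
print(d(y, -1.0), d2(y, -1.0))  # ≈ 0 and ≈ -6 < 0: a maximum at x = -1
```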

As an aside, I'd like to complain about prevailing notation. The most common way to write the second derivative of y wrt x is:
LaTeX:
\[ \frac{d^2y}{dx^2} \]
But this is wrong. The first derivative is a quotient, so you need to use the quotient rule:
LaTeX:
\[ d(\frac{dy}{dx})=\frac{ddydx-dyddx}{dx^2}\\ =\frac{d^2y}{dx}-\frac{dy}{dx}\frac{d^2x}{dx}\\ \frac{d(\frac{dy}{dx})}{dx}=\frac{d^2y}{dx^2}-\frac{dy}{dx}\frac{d^2x}{dx^2} \]
Which only reduces to the first notation if dx is a constant and ddx=0 by the constant rule. Unlike the first derivative, the typical notation of the second derivative will give you wrong answers if you treat it as an algebraically separable fraction, especially if the hidden assumption of x being the independent variable is violated. But the notation that uses the quotient rule does work out if treated as separable algebraic fractions.

PART 4: AREAS​

Unlike finding tangents, areas are easy to find... well, they're easy to approximate. To find the area between a curve and the x axis between two points, you just need to draw a bunch of rectangles that are very thin, and add up all the rectangles. The thinner the rectangles, the better the approximation. Of course, you can't just have zero width; then you would just have a bunch of lines. In order for the dimensions to work out, the rectangles need to have a width.

Anyway, this is how the area between a curve and the x axis, and the lines x=a and x=b, is denoted in the calculus
LaTeX:
\[ \int_a^bydx \]

The fancy capital S stands for the infinite sum of all the rectangles of height y and width dx between x=a and x=b. If you added up all those rectangles you would get the actual area... plus or minus an infinitesimal error term that can be rounded off, just like with the derivative. Note that this gives you the signed area. If the y value is negative, then the rectangle will have a negative sign attached to it. Likewise the dx has a direction: if you swap the bounds of integration you flip the sign of the dx and get a negative sign in front of every rectangle, which can then be pulled out of the integral by distributivity.
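A finite version of this sum of rectangles is easy to write down. This is my own sketch with a finite dx rather than an infinitesimal one, so it only approximates the area:

```python
def riemann(f, a, b, n=100000):
    """Approximate the integral of f from a to b as n thin rectangles of width dx."""
    dx = (b - a) / n
    return sum(f(a + i * dx) * dx for i in range(n))

print(riemann(lambda x: x ** 2, 0, 1))  # ≈ 1/3
print(riemann(lambda x: x ** 2, 1, 0))  # swapped bounds flip the sign of dx: ≈ -1/3
```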

But this is all just academic theory. Adding up all those rectangles of y*dx would take literally forever.

PART 4a: THE FUNDAMENTAL THEOREM OF CALCULUS​

Unless there was a trick. It doesn't actually make intuitive sense for there to be a trick. Areas and Tangents don't seem like they should have anything to do with each other. But they do. It turns out that:
LaTeX:
\[ \int_a^b \frac{dy}{dx}dx \]
Is, under certain conditions, a telescoping sum. i.e.
LaTeX:
\[ \int_a^b \frac{dy}{dx}dx=\int_a^b1dy=y|_b-y|_a \]
This works because the differential is a difference
LaTeX:
\[ (y_1-y_2)+(y_2-y_3)=y_1-y_3 \]
And therefore the Infinite Sum and the Infinitesimal Difference are inverse operations, just like addition and subtraction are. Indeed, the integral is a sum of products and the derivative a quotient of differences. So long as the function is continuous, that is an infinitesimal dx corresponds to an infinitesimal dy, the integral of a derivative will telescope and collapse into an easy finite difference. Thus, so long as you don't try to integrate across a discontinuity, you can find areas exactly as long as you can find an anti-derivative. An example of where this condition fails is y=(1/x)^2 at 0. Infinitesimal dx corresponds to infinite dy, and you get plainly and obviously wrong answers if you try to treat the integral as a telescoping sum across zero.
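The telescoping can be seen directly with a finite sum standing in for the infinite one. In this sketch of my own, each dy = y(x+dx) - y(x) cancels against its neighbor, so the whole sum collapses to y(b) - y(a):

```python
def integral_of_derivative(y, a, b, n=100000):
    """Sum the differences dy = y(x+dx) - y(x) across [a, b]: a telescoping sum."""
    dx = (b - a) / n
    total = 0.0
    x = a
    for _ in range(n):
        total += y(x + dx) - y(x)  # each term cancels against the next
        x += dx
    return total

y = lambda x: x ** 3
print(integral_of_derivative(y, 1, 2))  # collapses to y(2) - y(1) = 7
```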

So you're in business: so long as you can find an anti-derivative, you can find the area under a curve. The issue is that symbolic integration can sometimes be impossible using only algebraic functions. The anti-derivative of y=1/x cannot be expressed as a finite combination of additions, subtractions, multiplications, divisions, and root extractions. Even adding in the trig functions, their inverses, and the exponential and logarithmic functions only pushes the problem back. These functions are typically called the "elementary functions", and they are not closed under anti-differentiation. Still, it is useful to try to find symbolic anti-derivatives in terms of the elementary functions, because where we can find such expressions we can evaluate areas exactly.

Because mathematicians never let a symbol go to waste, the notation for anti-differentiation is exactly the same as for the integral, just without the bounds of integration.

PART 5: INTEGRALS​

Where the process of differentiating can be almost mechanical, the art of finding anti-derivatives is much fuzzier. Most methods of finding anti-derivatives are based on reversing the differentiation rules. Like the derivative, the integral distributes over addition and subtraction.

Integration by Parts is just the product rule in reverse:
LaTeX:
\[ d(uv)=udv+vdu \\ \int d(uv)=\int (udv+ vdu) \\ uv=\int udv+\int vdu \\ \int udv=uv-\int vdu \]
Where figuring out which parts of the integrand to make u and which to make dv in order to make it easier rather than harder to find an anti-derivative is mostly a matter of trained intuition. There is a lot of depth to doing integration by parts correctly. You can even get Taylor series out of integration by parts and the fundamental theorem of calculus. As an aside, if that second line reminds you of path integrals, there's a reason for that.
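Here's a standard worked example, checked numerically (my own sketch): for the integrand x·e^x, choosing u = x and dv = e^x dx gives uv - ∫v du = x·e^x - e^x, and a fine-grained sum of rectangles agrees with that anti-derivative.

```python
import math

def riemann(f, a, b, n=200000):
    # midpoint rectangles for a more accurate finite approximation
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

# integrate x*e^x over [0, 1] directly, then via the by-parts anti-derivative
lhs = riemann(lambda x: x * math.exp(x), 0, 1)
antideriv = lambda x: x * math.exp(x) - math.exp(x)
print(lhs, antideriv(1) - antideriv(0))  # both ≈ 1.0
```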

Integration by Substitution is literally just the chain rule in reverse, with Trig Substitution just being a fancy u-sub:
LaTeX:
\[ \int \frac{du}{dv}dv=\int du \]
Meanwhile, the power rule is again just the differential power rule in reverse:
LaTeX:
\[ \int x^ndx=\frac{x^{n+1}}{n+1} \; , \; n \neq -1 \]

Note that due to how any constant becomes zero when differentiated, when finding anti-derivatives you need to add a constant term to the expression. Which constant? We don't know enough to say unless given more information. So it is typically just noted as +C, and left as is.

PART 6: TRANSCENDENTAL FUNCTIONS​

There are a number of functions in math that "transcend" algebra. Most functions are transcendental, but the ones that are commonly encountered are fairly few. The trig functions and the logarithmic functions, along with their inverses, the arctrig functions and the exponential functions, are the ones that were actually encountered before the invention of the Calculus dramatically expanded the range of functions people interacted with. So in this final part I'm going to show how to differentiate these functions from first principles... well, the trig functions and the exponential and log anyway. The inverse trig functions are left as an exercise to the readers.

First are the trig functions. We only actually need one of them: if you know how to differentiate sine or cosine, then the differentiation rules will let you differentiate any of the trig functions. But I'll go over both sine and cosine at the same time, because it's easy to do both at once when you're working with differentials.

First we define the sine and cosine functions geometrically. Given a circle of radius r centered at the origin
LaTeX:
\[ \sin {\theta} = \frac{y}{r} \\ \cos {\theta} = \frac{x}{r} \]
Where (x,y) is the point on the circle where the ray from the origin at angle θ from the x axis intersects the circle. Letting s be the arclength, then from the definition of the circle and radians we know the following:
LaTeX:
\[ ds=rd\theta \\ x^2+y^2=r^2 \\ 2xdx+2ydy=0 \Rightarrow xdx+ydy=0 \\ ds^2=dx^2+dy^2 \]
This is enough to do some algebra:
LaTeX:
\[ ds^2=r^2d\theta^2=dx^2+dy^2 \\ =dx^2(1+(\frac{dy}{dx})^2)=dy^2((\frac{dx}{dy})^2+1)\\ =dx^2(1+(\frac{x}{y})^2)=dy^2((\frac{y}{x})^2+1) \\ =dx^2(\frac{y^2+x^2}{y^2})=dy^2(\frac{y^2+x^2}{x^2}) \\ r^2d\theta^2=r^2\frac{dx^2}{y^2}=r^2\frac{dy^2}{x^2} \\ d\theta^2=\frac{dx^2}{y^2}=\frac{dy^2}{x^2}\\ dx^2=(yd\theta)^2 \; , \; dy^2=(xd\theta)^2 \]
Looking at the circle we see that when a positive change in angle happens the sign of dy is the same as x, and the sign of dx is opposite y. Keeping in mind also that r is constant and can be moved in and out of the differential freely:
LaTeX:
\[ dx=-yd\theta \; , \; dy=xd\theta \\ \frac{dx}{r}=-\frac{y}{r}d\theta \; , \; \frac{dy}{r}=\frac{x}{r}d\theta \\ d\frac{x}{r}=-\frac{y}{r}d\theta \; , \; d\frac{y}{r}=\frac{x}{r}d\theta \\ d(\cos(\theta))= -\sin(\theta)d\theta \; , \; d(\sin(\theta))=\cos(\theta)d\theta \]
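Both differentials are easy to spot-check numerically. A quick sketch of my own, with a finite h standing in for dθ:

```python
import math

# check d(sin θ) = cos θ dθ and d(cos θ) = -sin θ dθ at an arbitrary angle
theta, h = 0.7, 1e-7
dsin = (math.sin(theta + h) - math.sin(theta)) / h
dcos = (math.cos(theta + h) - math.cos(theta)) / h
print(dsin, math.cos(theta))   # ≈ equal
print(dcos, -math.sin(theta))  # ≈ equal
```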

The logarithm was invented for a practical purpose: addition is easier than multiplication, so if we had a way to turn multiplication into addition we could spend less time in tedious computation and more time doing the interesting stuff. As such a logarithmic function is any non-zero function that obeys the following relationship:
LaTeX:
\[f(xy)=f(x)+f(y)\]
This is where we bring back the idea that the Integral is an area and not just the reverse of differentiation. Back before the days of calculus, when a master was explaining his findings on the quadrature of the hyperbola y=1/x, his student noticed the following:
LaTeX:
\[ \int_1^{ab}\frac{1}{x}dx=\int_1^{a}\frac{1}{x}dx+\int_a^{ab}\frac{1}{x}dx \\ \mathrm{let} \; x=au, u=\frac{x}{a} \\ dx=adu \\ \int_a^{ab}\frac{1}{x}dx=\int_1^{b}\frac{1}{au}adu =\int_1^{b}\frac{1}{u}du \\ \mathrm{so} \\ \int_1^{ab}\frac{1}{x}dx=\int_1^{a}\frac{1}{x}dx+\int_1^{b}\frac{1}{u}du \]
Which is to say that the function for the area under the curve of the hyperbola, measured starting from x=1, is given by a logarithmic function. Presently we don't know the base of the logarithm, but we do know that the derivative of this logarithm must be 1/x. Which is interesting, because a logarithmic function in any base is a constant multiple of any other base's logarithm. So the lack of any multiple means that this logarithm is in some sense natural, and its corresponding exponential function must also be natural in the same sense. Let's call this "natural logarithm" ln(x).
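The student's observation can be checked numerically. This is my own sketch (the function name is mine): summing thin rectangles under y = 1/x from 1 outward really does satisfy f(ab) = f(a) + f(b), and agrees with the natural logarithm.

```python
import math

def area_under_hyperbola(b, n=200000):
    """Area under y = 1/x from x = 1 to x = b, via midpoint rectangles."""
    dx = (b - 1) / n
    return sum(1 / (1 + (i + 0.5) * dx) * dx for i in range(n))

a, b = 2.0, 3.0
print(area_under_hyperbola(a * b))                        # ≈ ln(6)
print(area_under_hyperbola(a) + area_under_hyperbola(b))  # same value: f(ab) = f(a) + f(b)
print(math.log(a * b))                                    # the natural log agrees
```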

The derivative of the inverse function is the reciprocal of the derivative. This makes sense because the derivative of y wrt x is the change in y divided by the change in x. Thus if you divide dx by dy instead it's like flipping the axes. Thus, for the exponential function of the same base, call that base "e" for now:
LaTeX:
\[ y=e^x \\ ln(y)=x \\ \frac{dy}{y}=dx \\ \frac{1}{y}=\frac{dx}{dy} \\ \frac{dy}{dx}=y=e^x \\ d(e^x)=e^xdx \]
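And numerically, e^x really is its own derivative. A one-liner sketch of my own: the difference quotient divided by e^x itself should be ≈ 1.

```python
import math

# check that d(e^x)/dx reproduces e^x itself: the ratio should be ≈ 1
x, h = 1.3, 1e-7
ratio = (math.exp(x + h) - math.exp(x)) / h / math.exp(x)
print(ratio)
```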

The differentials of the inverse trig functions can be found in the same way, which is left as an exercise to the reader.

This finally lets us prove the power rule for all real numbers, rational and irrational:
LaTeX:
\[ d(x^n)=d(e^{n \ln (x)}) \\ =e^{n \ln (x)}d(n\ln(x)) \\ =x^n(\ln(x)dn+nd(\ln(x))) \\ =x^n(0+n\frac{dx}{x}) \\ =n\frac{x^n}{x}dx\\ =nx^{n-1}dx \]

And to round us off, here's the functional power rule:
LaTeX:
\[ d(u^v)=d(e^{v \ln(u)}) \\ =e^{v \ln(u)}d(v \ln(u)) \\ =u^v(\ln(u)dv+vd(\ln(u))) \\ =\ln(u)u^vdv+\frac{vu^v}{u}du \\ d(u^v)=\ln(u)u^vdv+vu^{v-1}du \]
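The functional power rule can also be spot-checked numerically. A hypothetical example of my own, taking u = x and v = sin(x), i.e. f(x) = x^sin(x):

```python
import math

# check d(u^v) = ln(u)*u^v*dv + v*u^(v-1)*du for u = x, v = sin(x)
u = lambda x: x
v = lambda x: math.sin(x)
f = lambda x: u(x) ** v(x)

x, h = 2.0, 1e-7
numeric = (f(x + h) - f(x)) / h
# here du/dx = 1 and dv/dx = cos(x)
formula = math.log(u(x)) * f(x) * math.cos(x) + v(x) * u(x) ** (v(x) - 1) * 1
print(numeric, formula)  # ≈ equal
```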
 
You might be interested to hear that, starting in the 1960s, there has been modern work to put the idea of infinitesimals on a rigorous foundation under the name "non-standard analysis". The basic idea is that you can introduce infinitesimals into the real numbers in such a way that it doesn't change the truth value of any (first-order) logical statement about the real numbers, and then you can define limits, derivatives, integrals, and so on in terms of the relationship between "standard" and "non-standard" numbers. I found this series of blog posts to be a good introduction to the construction, though it doesn't talk about the application to real analysis at all.
 
By Euclid 1-11, it is possible to draw a line perpendicular to the X-Axis at (0,0).
One thing that always confused me and no teacher ever took time to explain: in physics land where we have no assumptions and the only starting tools are a straightedge and circle maker, how do we know the perpendicular line we make is truly 90° off of the x axis? The only requirement stated is that it goes off in a new direction from the x axis, but that applies to any line from 1° to 179°.
 
how do we know the perpendicular line we make is truely 90° off of the x axis?

Perpendicular lines in Euclidean space are definitionally at 90 degree angles to each other.

You can construct a perpendicular line using only a straightedge and compass, because the intersecting points of two circles that are both centered on a line will always be perpendicular to the line.

 
You can construct a perpendicular line using only a straightedge and compass, because the intersecting points of two circles that are both centered on a line will always be perpendicular to the line.

To confirm I understand, a circle with origin point of 1,0 and an edge of -1,0 and a 2nd circle with the reverse, the points they intersect are at 0x and some point y, so a new line drawn from 0,0 and one of those intersecting points would be a perfectly perpendicular y axis?
 
To confirm I understand, a circle with origin point of 1,0 and an edge of -1,0 and a 2nd circle with the reverse, the points they intersect are at 0x and some point y, so a new line drawn from 0,0 and one of those intersecting points would be a perfectly perpendicular y axis?

Correct, though you don't actually need defined distances to start with at all if you're starting with a blank slate.

1). Pick any two arbitrary points on a plane. Let's call them A and B.
2). Draw a line through the two points, and define that line as the x-axis.
3). Draw a circle centered at A with the edge at B, and a second circle centered at B with an edge at A.
4). Draw a line through the two intersecting points of the circles.

This second line has two properties of note:
a). It will be perpendicular to the first line, and so it can be defined as the y-axis, and the intersection with it and the x-axis can be defined as the origin.
b). The intersection of the two lines is located precisely at the midpoint of the points A and B,
meaning they can be defined as being located at (-1,0) and (1,0) respectively.
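The construction above is easy to verify with coordinates after the fact. A quick sketch of my own (the point names match the steps above):

```python
import math

# circles of radius |AB| = 2 centered at A = (-1,0) and B = (1,0)
# intersect at (0, +sqrt(3)) and (0, -sqrt(3))
A, B = (-1.0, 0.0), (1.0, 0.0)
C = (0.0, math.sqrt(3))   # satisfies (x+1)^2 + y^2 = 4 and (x-1)^2 + y^2 = 4
D = (0.0, -math.sqrt(3))

ab = (B[0] - A[0], B[1] - A[1])  # direction of the line through A and B
cd = (D[0] - C[0], D[1] - C[1])  # direction of the line through the intersections
print(ab[0] * cd[0] + ab[1] * cd[1])  # dot product 0.0: the lines are perpendicular
```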
 
Thank you. I knew it had to be simple, but nobody ever showed how perpendicular was proven to be true, which always bothered me since they took the time to prove other definitions.
 
To confirm I understand, a circle with origin point of 1,0 and an edge of -1,0 and a 2nd circle with the reverse, the points they intersect are at 0x and some point y, so a new line drawn from 0,0 and one of those intersecting points would be a perfectly perpendicular y axis?
Euclidean geometry works in the opposite direction. We start out with points, lines, circles, and angles as basic notions, and ideas like Cartesian coordinates are derived from them. Euclid's actual proof here uses the construction of an equilateral triangle. Suppose you start out with two points A and B, and construct two circles with radius AB, centered on A and B. Call one of the intersection points C. Then AC and BC are also radii of the two circles, so they must have the same length as AB, so the triangle is equilateral.

To get a perpendicular line, we start with two points A and D, then construct a circle with AD as the radius, then extend AD to a diameter of the circle, intersecting at B. Because AD and BD are both radii of the same circle they have equal length, so we can then do the above construction to get an equilateral triangle where one of the edges is already bisected. The line CD then divides the triangle ABC in two. Each half has sides of the same lengths, so the angles must also be equal, and the two angles at D must add up to a line. Two right angles make a line, by definition, so CD must be perpendicular to AB.
 
I truly need to read this topic when my brain is awake and I have the free time to take time. It seems interesting.
 
It's been about five years since I last took a math course and I feel like most of my calculus knowledge has evaporated out of my head. Or maybe I slept through too many classes. Thanks for the refresher though!
 
Damn, have you considered writing math textbooks? I read over this and you explained it more concisely and clearly than a lot of the books I remember using.
 
Damn, have you considered writing math textbooks? I read over this and you explained it more concisely and clearly than a lot of the books I remember using.

Alas, this would get laughed out of any math course. It's not very rigorous. The use of infinitesimals is also frowned on in teaching the infinitesimal calculus.

Also, I'm not very happy with how I did the logarithmic and exponential functions. I should have talked about how logarithms are solutions to the functional equation "f(xy)=f(x)+f(y)" and how the area under the hyperbola fits as a solution and is therefore a logarithm.
 
Thank you for this excellent exposition of SV's LaTeX engine, which I didn't know existed. Your explanation of Calculus is interesting, but unfortunately I had to pause at the first major issue I found for now, due to time constraints.

As an aside, I'd like to complain about prevailing notation. The most common way to write the second derivative of y wrt x is:
\[ \frac{d^2y}{dx^2} \]
But this is wrong. The first derivative is a quotient, so you need to use the quotient rule:
\[ d(\frac{dy}{dx})=\frac{ddydx-dyddx}{dx^2}\\ =\frac{d^2y}{dx}-\frac{dy}{dx}\frac{d^2x}{dx}\\ \frac{d(\frac{dy}{dx})}{dx}=\frac{d^2y}{dx^2}-\frac{dy}{dx}\frac{d^2x}{dx^2} \]
Which only reduces to the first notation if dx is a constant and ddx=0 by the constant rule. Unlike the first derivative, the typical notation of the second derivative will give you wrong answers if you treat it as an algebraically separable fraction, especially if the hidden assumption of x being the independent variable is violated. But the notation that uses the quotient rule does work out if treated as separable algebraic fractions.

Consider what it means to take the differential of x twice. By taking the differential of x in this context, we are treating x as a function of itself.
LaTeX:
\[ dx = (x+dx) - x \]

By taking the differential again, we are comparing the differential of x at x versus the differential at x+dx, i.e.
LaTeX:
\[ \begin{split} d(dx) &= [(x+dx+(dx)_2) - (x + dx)] - [(x+(dx)_2)-x]\\ &=(dx)_2 - (dx)_2\\ &=0 \end{split} \]

Which means everything you had to the right of the minus sign disappears. The fact that this is actually two linear steps is clearer when we use the more traditional notation for derivatives.
Standard definition of derivatives:
\[ f'(x) = \lim_{h\to0}\frac{f(x+h)-f(x)}{h} \]

Note that the only components of f'(x) that are themselves functions of x are f(x+h) and f(x), both of which are in the numerator, so no quotient rule is involved when we take the derivative with respect to x again. In fact, the numerator is a linear combination of the two, so the end result will simply be a linear combination of their respective derivatives, like so

LaTeX:
\[ f''(x) = \lim_{h'\to0}\lim_{h\to0} \frac{[f(x+h+h')-f(x+h)] - [f(x+h')-f(x)]}{hh'} \]

By definition, if the second derivative exists, we are guaranteed a close enough estimate of it when both h and h' are "sufficiently small in absolute value" (the more technical terminology involves deltas and epsilons, which I will be happy to explain another time if anyone's interested). It's obvious that we can shift the larger of h and h' (in terms of absolute value) to be even smaller, let's say the same value as the other one, at which point we can take this new, identical value for both h and h' and call it dx. When we do this, the denominator of the above equation reduces to (dx)^2, which we write as dx^2 in shorthand. On the other hand, the numerator is simply the differential of the differential df, i.e. d(df)=d^2f. Hence the notation for the second derivative
LaTeX:
\[ \frac{d}{dx}(\frac{df}{dx}) = \frac{d^2f}{dx^2} \]
 
Thank you for this excellent exposition of SV's LaTeX engine, which I didn't know existed. Your explanation of Calculus is interesting, but unfortunately I had to pause at the first major issue I found for now, due to time constraints.



Consider what it means to take the differential of x twice. By taking the differential of x in this context, we are treating x as a function of itself.
LaTeX:
\[ dx = (x+dx) - x \]

By taking the differential again, we are comparing the differential of x at x versus the differential at x+dx, i.e.
LaTeX:
\[ \begin{split} d(dx) &= [(x+dx+(dx)_2) - (x + dx)] - [(x+(dx)_2)-x]\\ &=(dx)_2 - (dx)_2\\ &=0 \end{split} \]

Which means everything you had to the right of the minus sign disappears.


You fail at both the first step and the second. Taking the differential of x twice doesn't mean we treat x as a function of itself; it means we are treating x as a function of some unknown variable. You are also treating d and x as separable, when the operator attaches to the variable, creating a new variable dx. So let's take the differential of x twice correctly.
LaTeX:
\[ d(d(x))=d((x+dx)-x) \\ ddx=((x+dx)-x) + ((dx+ddx)-dx)-((x+dx)-x) \\ ddx=dx+ddx-dx \\ ddx=ddx \]

ddx is only equal to zero if dx is a constant. Which is only going to happen if x is a linear function of whatever the underlying variable is. x is obviously a linear function of itself, but it might actually be a nonlinear function of t. Hell, t might turn out to be a nonlinear function of some other variable, like happened when special relativity dropped.

To see an example of this, the obvious chain rule in leibniz notation for the second derivative very notably doesn't work if you use the traditional notation.
LaTeX:
\[ \frac{ddy}{dx^2}(\frac{dx}{dt})^2 \neq \frac{ddy}{dt^2} \]
Unless x is a linear function of t. Meanwhile:
LaTeX:
\[ \frac{d^2y}{dt^2}-\frac{dy}{dt}\frac{d^2t}{dt^2}=(\frac{d^2y}{dx^2}-\frac{dy}{dx}\frac{d^2x}{dx^2})(\frac{dx}{dt})^2+\frac{dy}{dx}(\frac{d^2x}{dt^2}-\frac{dx}{dt}\frac{d^2t}{dt^2}) \]
Holds generally, as can be seen after a ton of messy algebra.
 
You fail at both the first step and the second. Taking the differential of x twice doesn't mean we treat x as a function of itself; it means we are treating x as a function of some unknown variable. You are also treating d and x as separable, when the operator attaches to the variable, creating a new variable dx. So let's take the differential of x twice correctly.
LaTeX:
\[ d(d(x))=d((x+dx)-x) \\ ddx=((x+dx)-x) + ((dx+ddx)-dx)-((x+dx)-x) \\ ddx=dx+ddx-dx \\ ddx=ddx \]

ddx is only equal to zero if dx is a constant. Which is only going to happen if x is a linear function of whatever the underlying variable is. x is obviously a linear function of itself, but it might actually be a nonlinear function of t. Hell, t might turn out to be a nonlinear function of some other variable, like happened when special relativity dropped.

Suppose we have a curve representing a function f: X->Y in the Euclidean plane, i.e. y=f(x), which is differentiable on some domain (let's say all reals for simplicity's sake). The slope of the tangent, dy/dx, is uniquely determined by the point (x,y) it is tangent to; but since f is a function, x uniquely determines y as well, therefore dy/dx must be a function of x as well, let's call it f':X->dy/dx.

What you are claiming here is that the slope of the tangent to f' in the Euclidean (x, dy/dx) plane, df'/dx, depends on the choice of parametric representation of x. Do you see the problem?

With respect to ddx, you are ignoring the fact that the differential is an operator on the set of functions, not numbers. df(z) = f(z+dz)-f(z), where z is a variable to be attached at another time (x in the case of a function in the (x,y) Euclidean plane). Otherwise you get a circular definition like above.
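The operator view can be sketched in a few lines of Python (the cube function and the step size are arbitrary choices): d takes a function and returns a function, so it can be applied twice without ever asking what dz "is" for a bare number.

```python
# d maps a function f to the function z -> f(z+dz) - f(z), per the definition above.
def d(f, dz):
    return lambda z: f(z + dz) - f(z)

dz = 1e-4
f = lambda z: z**3
ddf = d(d(f, dz), dz)          # the difference operator applied twice

# ddf(z)/dz^2 approximates the second derivative f''(z) = 6z
print(ddf(1.0) / dz**2)
```

With an appreciable dz this is just the classical second finite difference; the infinitesimal picture shrinks dz while keeping the same algebra.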

This has nothing to do with you subtracting 0 (represented in a convoluted way) and everything to do with properly applying the product rule to the first derivative. Everything you have to the right of the respective minus signs is still 0 and has no effect on the end result.
 
Incorrect. I am claiming that for a given curve, the choice of independent variable is arbitrary. By convention x is typically considered the independent variable, but y=e^x can just as well be a graph of the natural logarithm if y is chosen as an independent variable. Or both x and y can be coordinate functions of some vector valued function of t.

Like, we both agree that the second differential of an independent variable is 0, right? If x is the independent variable, ddx=0. If t is the independent variable, then ddt=0, but ddx might not be.

Then why does the algebra work out when you have the -(dy/dx)(ddx/dx^2) but not if you delete it? If you use the traditional notation, then the correct chain rule for the second derivative doesn't cancel out correctly. But if you use the quotient rule, it does. This is not a coincidence; Leibniz chose his notation very carefully. But he didn't have the modern concept of functions. He had curves where the variables were interrelated and the equation was a constraint on valid (x,y)-tuples. Very notably: a circle is not a function, but if his calculus couldn't handle a circle he'd have been laughed out of the room for proposing it.

So you have an equation that gives relations between variables, with x and y just being considered the first among equals. There was also arclength, and subtangent, and subnormal, and a whole ton of others. Picking an independent variable sets ddx or ddy or dds to 0 and gives a ton of simplifications when working with second or higher order differentials... but those simplifications can be very different between different choices for independent variable.

Which is why you see Leibniz and the two Bernoullis he was contemporary with keep talking about "assume dx constant" or "assume dy constant". That's them assigning an independent variable so that they can simplify down expressions involving higher order differentials, because that algebra gets atrociously messy if you don't. But, of course, if you make that assumption you have to stick with it. That is why the algebra for the second derivative chain rule doesn't work out in the traditional notation: the traditional notation assumes whatever is on the bottom of the derivative is the independent variable, while the chain rule requires that at least one of the two variables on the bottom be dependent.
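The point that ddx vanishes exactly when x is linear in the underlying variable can be seen with ordinary finite second differences, h standing in for the infinitesimal increment of the independent variable (the two functions below are arbitrary examples):

```python
# Second finite difference of x(t): (x(t+2h) - x(t+h)) - (x(t+h) - x(t)).
def second_diff(x, t, h):
    return x(t + 2*h) - 2*x(t + h) + x(t)

h = 1e-3
linear = lambda t: 5*t + 2     # x linear in t: second difference is exactly 0
cubic = lambda t: t**3         # x nonlinear in t: second difference ~ 6*t*h**2

print(second_diff(linear, 1.0, h))    # essentially 0
print(second_diff(cubic, 1.0, h))     # small but nonzero, of order h**2
```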
 
The relation to some fundamental independent variable is what I meant by "choice of parametrization" - and it is irrelevant when we are talking about the second derivative of y with respect to x. For all (x,y) in an arbitrary function f, if f is twice differentiable at (x,y), then the second derivative of y with respect to x is the same regardless of whether x is itself a function of some other variable.

Side note: formal definition of functions:
\[ f = \{(x,y)\lvert \forall a, b \in f, a\neq b \implies x_a\neq x_b\} \]

I think I need a clearer definition of what ddy means in your notation at this point.

That the algebra works out is not because of -(dy/dx)(ddx/dx^2), but because these are fundamentally different equations. The form that doesn't work out is based on the idea that:
Incorrect:
\[ \frac{d}{dt}(\frac{dy}{dt}) = \frac{d}{dx}(\frac{dy}{dx})\cdot(\frac{dx}{dt})^2 \]

The form that does work out looks instead like
Correct:
\[ \frac{d}{dt}(\frac{dy}{dt}) = \frac{d}{dx}(\frac{dy}{dx})\cdot (\frac{dx}{dt})^2 + \frac{dy}{dx}\cdot \frac{d}{dt}(\frac{dx}{dt}) \]
As I alluded to in my previous post, the extra term comes from the correct application of product rule to dy/dt = (dy/dx)(dx/dt), not from the extraneous term you added to the formula (which doesn't break anything, because it is equal to 0).

In fact, it is not generally possible to convert from the second derivative of y with respect to x to the second derivative of y with respect to t, even when x is related to t by an entire (that is, everywhere continuous and infinitely differentiable) function. For example, consider the information
LaTeX:
\[ \frac{d}{dx}(\frac{dy}{dx}) = 1, \forall x\in\mathbb{R}\\ x=t^3 \]
I claim it is impossible to determine the second derivative of y with respect to t based on this information alone. To the contrary:
One possible map between x and y:
\[ y=\frac{x^2}{2}=\frac{t^6}{2}\\ \frac{dy}{dt}=3t^5\\ \frac{d}{dt}(\frac{dy}{dt})=15t^4 \]
But:
A different map between x and y that has the same second derivative of y with respect to x:
\[ y=\frac{x^2}{2}+x=\frac{t^6}{2}+t^3\\ \frac{dy}{dt}=3t^5+3t^2\\ \frac{d}{dt}(\frac{dy}{dt})=15t^4+6t \]
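That non-invertibility is straightforward to confirm in SymPy, using the two maps above with x = t³ in both cases:

```python
import sympy as sp

t = sp.symbols('t')
x = t**3
results = []
for y in (x**2/2, x**2/2 + x):                   # the two maps from the post
    dy_dx = sp.diff(y, t) / sp.diff(x, t)        # dy/dx via the parameter t
    d2y_dx2 = sp.simplify(sp.diff(dy_dx, t) / sp.diff(x, t))
    results.append((d2y_dx2, sp.expand(sp.diff(y, t, 2))))

print(results)    # same second derivative in x, different ones in t
```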
 
ddy means the same thing it meant to Leibniz and Euler: the second difference of the successive values of y, when the increment is infinitesimal instead of unity or some other appreciable value.

You clearly don't understand what I mean by "the algebra works out" so I'll go through it slowly for you.

LaTeX:
\[ \frac{ddy}{du^2}(\frac{du}{dv})^2=\frac{ddy}{dv^2}=D_v^2y \;\ \mathrm{iff} \;\ u=av+b \\ \frac{ddy}{du^2}(\frac{du}{dv})^2=\frac{ddy}{dv^2} \neq D_v^2y \;\ \mathrm{if} \;\ u \neq av+b \]
Where a,b are some constant.
Meanwhile:
LaTeX:
\[ (\frac{d^2y}{du^2}-\frac{dy}{du}\frac{d^2u}{du^2})(\frac{du}{dv})^2+\frac{dy}{du}(\frac{d^2u}{dv^2}-\frac{du}{dv}\frac{d^2v}{dv^2})\\ \frac{d^2y(du)^2}{du^2dv^2}-\frac{dy}{du}\frac{d^2u(du)^2}{du^2dv^2}+\frac{dyd^2u}{dudv^2}-\frac{dydu}{dudv}\frac{d^2v}{dv^2}\\ \frac{d^2y}{dv^2}-\frac{dyd^2u}{dudv^2}+\frac{dyd^2u}{dudv^2}-\frac{dy}{dv}\frac{d^2v}{dv^2}\\ \frac{d^2y}{dv^2}+(-\frac{dyd^2u}{dudv^2}+\frac{dyd^2u}{dudv^2})-\frac{dy}{dv}\frac{d^2v}{dv^2}\\ \frac{d^2y}{dv^2}+(0)-\frac{dy}{dv}\frac{d^2v}{dv^2}\\ \frac{d^2y}{dv^2}-\frac{dy}{dv}\frac{d^2v}{dv^2} \]
Everything cancels out algebraically, just like the chain rule for the first derivative. This doesn't happen in the traditional notation because if u is a nonlinear function of v, then ddu doesn't vanish.
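The cancellation really is pure algebra. Treating each differential as a formal, nonzero symbol (not a SymPy derivative), SymPy reproduces it:

```python
import sympy as sp

# Differentials as formal, nonzero algebraic symbols.
dy, ddy, du, ddu, dv, ddv = sp.symbols('dy ddy du ddu dv ddv', nonzero=True)

lhs = (ddy/du**2 - (dy/du)*(ddu/du**2))*(du/dv)**2 \
    + (dy/du)*(ddu/dv**2 - (du/dv)*(ddv/dv**2))
rhs = ddy/dv**2 - (dy/dv)*(ddv/dv**2)

print(sp.simplify(lhs - rhs))   # 0
```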

In fact, it is not generally possible to convert from the second derivative of y with respect to x to the second derivative of y with respect to t, even when x is related to t by an entire (that is, everywhere continuous and infinitely differentiable) function.
No shit. The differential loses constants and lowers the degree of polynomials. Even if a function is complex differentiable, with all the nice properties that entails, you still need as many initial conditions as the number of times you've differentiated to recover the initial function. I never said otherwise, and if I ever implied otherwise I miscommunicated, and badly.
 
One thing that always confused me and no teacher ever took time to explain: in physics land, where we have no assumptions and the only starting tools are a straightedge and circle maker, how do we know the perpendicular line we make is truly 90° off of the x axis? The only requirement stated is that it goes off in a new direction from the x axis, but that applies to any line from 1° to 179°.
I'm not sure what you mean by physics land (as opposed to math land?) but physics is done using the language of math. So all the mathematical stuff is basically baked in from the beginning, and as others have shown, it can be proven that you are guaranteed to be able to draw a perpendicular line. A little beyond that, though: in physics your choice of coordinate system is arbitrary. You could have non-perpendicular axes; it's just that, for introductory physics especially, it is usually very inconvenient to work in such a system.
 
Thank you for an actual worked example of your math process. I would prefer an actual definition of ddy, or at least a worked example involving finding ddy/dx^2 in the context of concretely defined (x,y), but this is enough to pin down at least one of your errors.

You implicitly used the following equation in your algebra
Left-most component, transformation from 2nd to 3rd line:
\[ \frac{d^2y}{du^2}(\frac{du}{dv})^2 = \frac{d^2y}{dv^2} \]

Which, if generally true, leads to rather unusual results, like when we choose a point where du/dv=0
LaTeX:
\[ \frac{du}{dv} = 0 \implies \frac{d^2y}{dv^2}=\frac{d^2y}{du^2}\cdot 0 = 0\\ \begin{split} \implies \frac{d}{dv}(\frac{dy}{dv}) &= \frac{d^2y}{dv^2}-\frac{dy}{dv}\frac{d^2v}{dv^2} \text{(by your formula for 2nd derivative)}\\ &=\frac{d^2y}{dv^2}-\frac{dy}{du}\frac{du}{dv}\frac{d^2v}{dv^2}\\ &=0-\frac{dy}{du}\cdot 0 \cdot \frac{d^2v}{dv^2} \text{(by first line)}\\ &=0 \end{split} \]

But it's easy enough to provide a counter-example
Counter example where du/dv=0 at v_0, but the second derivative of y with respect to v is non-zero:
\[ y=e^u\\ u=v^2\\ \frac{dy}{dv} = (2v)e^{v^2}\\ \frac{d}{dv}(\frac{dy}{dv}) = 2e^{v^2} + (2v)^2e^{v^2}\\ \frac{du}{dv} = 2v\\ \frac{du}{dv}\rvert_{v=0}=0\\ \frac{d}{dv}(\frac{dy}{dv})\rvert_{v=0}=2e^0+0e^0=2 \neq 0\\ \]

Quod erat demonstrandum.
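The counter-example checks out in SymPy:

```python
import sympy as sp

v = sp.symbols('v')
u = v**2
y = sp.exp(u)                              # y = e^u with u = v^2

du_dv_at_0 = sp.diff(u, v).subs(v, 0)      # du/dv at v = 0
d2y_dv2_at_0 = sp.diff(y, v, 2).subs(v, 0) # second derivative of y in v at v = 0

print(du_dv_at_0, d2y_dv2_at_0)            # 0 2
```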
 
Like I said, ddy is the second infinitesimal difference of y values. But sure, I'll do a worked example of finding that particular quotient in terms of a concretely defined curve.
LaTeX:
\[ y=u^2\\ dy=2udu, \\ ddy=2du^2+2uddu \\ \frac{ddy}{du^2}=2+2u\frac{ddu}{du^2}\\ \frac{ddy}{du^2}-2u\frac{ddu}{du^2}=2 \\ \mathrm{let} \;\ u=v^3\\ du=3v^2dv \\ ddu=6vdv^2+3v^2ddv\\ ddy=2(3v^2dv)^2+2(v^3)(6vdv^2+3v^2ddv) \\ ddy=18v^4dv^2+12v^4dv^2+6v^5ddv \\ ddy=30v^4dv^2+6v^5ddv \\ \frac{ddy}{dv^2}=30v^4+6v^5\frac{ddv}{dv^2} \\ \frac{ddy}{dv^2}-6v^5\frac{ddv}{dv^2}=30v^4 \]
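As a cross-check on the last line: with v taken as the independent variable (so ddv = 0), ddy/dv² should coincide with the ordinary second derivative of y = (v³)² = v⁶, and it does:

```python
import sympy as sp

v = sp.symbols('v')
y = (v**3)**2                 # y = u^2 with u = v^3

# Ordinary second derivative of y in v, which should equal 30*v**4,
# matching ddy/dv^2 - 6v^5(ddv/dv^2) = 30v^4 once ddv is set to 0.
print(sp.diff(y, v, 2))       # 30*v**4
```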

You haven't demonstrated shit, other than the ability to get wrong answers when you divide by zero. If a quotient equals zero, that means the numerator equals zero. Which means that if du/dv=0, du=0. So when you multiply by du/du in your example, you aren't multiplying by 1, you're multiplying by the indeterminate form 0/0. Also, the ddy/du^2 in your first line is undefined, because again, division by zero.
 
By that logic, the chain rule for the first derivative would give an indeterminate result whenever the inner function is at a critical point (i.e. its derivative is 0). That is not actually the case.

Derivatives are pretty much conversion factors. dy/du describes the rate at which a small change in u effects a corresponding change in y. Take the chain rule for the first derivative as an example: a change in v causes a change in y purely through the intermediate step of effecting a change in u. If, locally, a change in v causes no change in u (i.e. du/dv=0), then clearly the lack of change in u fails to convert into any change in y. A tactile reader may imagine v, u, and y as gears, where v is not connected to u, and u connects to y.

It is true that dy/dv can be non-zero when du/dv=0, but that has nothing to do with du=0; rather, it happens when dy/du is undefined at the point in question, and the slopes of secant lines exhibit unbounded behaviour when approaching that point. The necessity of the second part is easy to see when we use concepts from the modern (1821, when Cauchy published his Cours d'Analyse, is pretty modern, right?) limit conception of calculus. The sketch of the proof goes like this: if the slopes of (u,y) secant lines were bounded on an interval around the point of interest, let's say they all have absolute value less than M, then we can choose a small interval around the point of interest such that the absolute slope of (v,u) secant lines are always less than ϵ/M (such an interval exists because du/dv=0 at the point of interest). Then the absolute slopes of (v,y) secants are guaranteed to be less than M*(ϵ/M)=ϵ; and we can find such an interval for any small positive ϵ you define as "close enough" to 0. Therefore dy/dv=0 by definition of limits, which violates the non-zero dy/dv requirement, done.

All of which goes to show that calculus isn't really algebraic (derivatives and antiderivatives are linear operators on the vector space of functions, however).


Before we go further, let us find closed forms for the terms of your invention. For any variables y, u, v such that y=f(u) and u=g(v), where f and g are functions, the following holds when du/dv is not 0.

LaTeX:
\[ \frac{ddy}{du^2}= \begin{cases} \frac{d}{dv}(\frac{dy}{dv}) & \text{if g is the identity function}\\ \frac{d}{dv}(\frac{dy}{dv})(\frac{du}{dv})^{-2} & \text{otherwise} \end{cases} \]

Remark: ddy/(du^2) is undefined whenever f''(u) is non-zero at a critical point of g(v). In fact, ddy/(du^2) will approach either positive or negative infinity, depending on the sign of f''(u), as we approach the critical point (assuming f''(u) and g'(v) are both continuous).
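The divergence in the remark can be checked with the earlier example y = e^u, u = v², using the "otherwise" closed form with v as the underlying variable:

```python
import sympy as sp

v = sp.symbols('v', positive=True)
u = v**2
y = sp.exp(u)

# Closed form for ddy/du^2: [d/dv(dy/dv)] * (du/dv)^(-2).
ratio = sp.diff(y, v, 2) / sp.diff(u, v)**2

# v = 0 is a critical point of g(v) = v^2 with f''(u) = e^u > 0 there,
# so the ratio should blow up as v approaches 0 from the right.
print(sp.limit(ratio, v, 0, '+'))    # oo
```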

Now that we have the closed form for ddy/du^2, it's easy enough to see it can be derived from the chain rule for the second derivative.

Suppose du/dv is non-zero:
\[ \frac{d}{dv}(\frac{dy}{dv}) =( \frac{d}{du}(\frac{dy}{du}))(\frac{du}{dv})^2 + \frac{dy}{du}\cdot\frac{d}{dv}(\frac{du}{dv})\\ \frac{d}{du}(\frac{dy}{du})(\frac{du}{dv})^2 = \frac{d}{dv}(\frac{dy}{dv}) - \frac{dy}{du}\cdot\frac{d}{dv}(\frac{du}{dv})\\ \frac{d}{du}(\frac{dy}{du}) = \left[(\frac{d}{dv}(\frac{dy}{dv}))(\frac{du}{dv})^{-2}\right] - \frac{dy}{du}\cdot\left[(\frac{d}{dv}(\frac{du}{dv}))(\frac{du}{dv})^{-2}\right] \]

Your definition of ddy/(du^2) is therefore an inverse of the process by which we calculate the second derivative of y with respect to v, using dy/du, du/dv and the second derivatives of y with respect to u, and u with respect to v. However, this process is not generally invertible when du/dv=0, as I have previously demonstrated.

So basically you have taken a property that is intrinsic to the function f and given it a definition that is unnecessarily dependent on some other function g; and indeed, this definition is not even well-defined everywhere.
 