Making a calculator from scratch -

2024 ж. 10 Мам.
28 800 Рет қаралды

Disclaimer: This video is rather programming heavy at points. You are welcome to skip these parts, but do note you might miss some small detail somewhere. These sections have the "Code:" prefix in the timestamps.
Evaluating math expressions is not an easy problem to solve, despite seeming extremely simple.
This video is a step by step guide to doing so, taking you through the motivations behind these steps as well.
This is my entry for the Summer of Math Exposition, hosted by @LeiosLabs and @3blue1brown
some.3b1b.co/
Timestamps:
0:00 - Intro
1:15 - Lexer/Tokenizer
3:04 - Code: Lexer Implementation
4:33 - Beyond Lexing
5:46 - Structuring Math Expressions
7:03 - Code: Representing AST Nodes
7:46 - Evaluating an AST
9:44 - Code: Basic Parsing Structure
11:02 - Order of Operations
12:12 - A general expression
13:57 - Colouring expressions
16:19 - Code: Parse Expression function logic
19:14 - Parenthesized Expressions
19:57 - Code: Parsing "Terminal" expressions
21:51 - Finished the basic pipeline
22:06 - Code Bonus: Implicit Multiplication
23:14 - Recap and outro
Links:
Tagged Unions - en.wikipedia.org/wiki/Tagged_...
Or possibly, a nicer alternative - blog.ryanmartin.me/tagged-unions
Code for this project in C - github.com/PixelRifts/math-ex...
Discord - / discord
Thanks to:
Garklein
Allen4th
IdkRandomTry
s m p l
Asher
GamesWithGabe
for your feedback
Music:
Somnia I - Reed Mathis
Interplanetary Alignment - NoMBe
Creeping Up on You - Godmode
#SoME3 #C #calculator
#MadeWithMotionCanvas

Пікірлер
  • Please note, this is *not* supposed to be the easiest way to make a calculator like this. As many have mentioned there are algorithms specifically for evaluating math expressions like shunting yard, which would probably be easier to implement. However this way of generating an AST has its own benefits: 1) This can be easily extended into an actual programming language 2) Since we get the full AST at the end, we can do some cool things with it if we just add an 'x' variable node. A few examples can be simplifying an AST by walking the tree and applying rules, or even creating an AST for the derivative of the given expression by analyzing the generated tree. Really the opportunities are endless here.

    @voxelrifts@voxelrifts8 ай бұрын
    • I have a question: so in the parse_terminal_expr it does if self.current.node_type == NodeType.NUMBER but then after that it also does if (self.current.node_type == NodeType.NUMBER or self.current.node_type == NodeType.OPEN_PARENTHESIS): which would be always be true if it passed thru the first one. but then in that if statement it does parse_expr which calls parse_terminal_expr and it just keeps going in a loop. I am trying to code your C code in python. Please tell me what I did wrong or how your C code doesnt go in a loop

      @jacobophoven90@jacobophoven908 ай бұрын
    • @@jacobophoven90 there's a few more conditions above the second condition you mentioned. Namely for unary plus and minus, which you can't implicit multiply with. Which is why they aren't in the list

      @voxelrifts@voxelrifts8 ай бұрын
  • I love how the word "AST", so as Abstract Syntax Tree, as the german word "Ast" translates to branch, wich is exactly what an AST is made out of

    @tomiivaswort6921@tomiivaswort69218 ай бұрын
  • As a compiler author, I really enjoyed this. Great video!

    @cynical5062@cynical50628 ай бұрын
    • A compiler author. The holy messiah amongst us all

      @tfr@tfr8 ай бұрын
  • In a way, this entire calculator is essentially an interpreter for a very simple (non-Turing complete) programming language, so it's completely natural that it'd be similar to other interpreters. It's also possible to argue that the lexer and parser are essentially simple compilers in their own right, translating a source string into more usable data, and then transforming it again into _more_ usable data. Really, the only difference between a compiler and an interpreter is that interpreters reduce their input to the final output on its own while compilers serialize the data it transformed the source to into a format the CPU and OS can understand. Most modern compilers just transform the input into something a more established compiler framework (usually LLVM) can understand and then pass that to said framework, at which point it can apply its own transformations to simplify the data so that the final serialization is smaller and/or faster. (You could even argue that a CPU is just an interpreter implemented in hardware. Modern CPUs also have hardware implemented _compilers_ too to transform the machine code even further into micro-operations that _then_ get interpreted.) I remember implementing a simple calculator with the Shunting Yard algorithm outputing RPN, which is basically just a flattened representation of the syntax tree, in, _of all things, DCPU-16 assembly. Gosh_ that was a long time ago...

    @angeldude101@angeldude1019 ай бұрын
    • Ai text?

      @berniusiii1627@berniusiii16273 ай бұрын
  • I was expecting you to talk about left- vs right-associativity, since exponentiation is right-associative. I think it's worth a mention at least.

    @odomobo@odomobo8 ай бұрын
    • Yeah, ditto - I actually thought the inclusion of exponentiation was deliberate as an opportunity to talk about associativity. It isn't too hard; it just changes whether "case 2" is handled like "case 1" or "case 3". But it does mean you have to track a bit more information about the operators.

      @rosuav@rosuav6 ай бұрын
  • The "loading" (or "waiting") animation on the tree nodes while evaluation works really well ! So visually simple, yet it clearly shows the order of operations.

    @EmKonstantin@EmKonstantin8 ай бұрын
  • I feel your pain, I've been trying off and on for half a year now to program my own calculator/CAS program, and my last iteration (before i stupidly decided to restart from scratch) somehow manages to evaluate Sums, integrals and derivatives analytically (so not just approximations with Riemann sums) and other stuff, but shits itself and dies (crashes my arduino) if it is asked to add 3 decimal numbers. It is very difficult to avoid technical debt, and program functions... in such a general way that you don't have to manually program in every interaction, but so concretely that they still do what you want them to

    @brummi9869@brummi98699 ай бұрын
    • I have yet to try it but it should definitely be possible (easy?) to analyze the AST and generate a new one for the derivative of a function. But calling anything easy is always famous last words

      @voxelrifts@voxelrifts9 ай бұрын
  • Thank you for taking the time to make a video on this topic :D I've been trying to get into low level programming and though I don't fully understand all the code and concepts in this video (on my first viewing), I hope to look into the stuff mentioned here and comeback later to help me implement my own version of this.

    @nevanjohn@nevanjohn9 ай бұрын
    • As in assembly?

      @AmCanTech@AmCanTech9 ай бұрын
    • @@AmCanTech even C and C++ are low level

      @robkojabko@robkojabko8 ай бұрын
  • This was a fantastic video. Well put together, clear and thorough without holding hands and filled with intuition. Great job!

    @albachteng@albachteng9 ай бұрын
  • I remember trying something similar during an assessment at uni. I wish I'd thought of doing a tokenising step first. Instead I tried to go straight from the string to the tree by finding the operator and splitting it into strings to the left and right of the operator. I remember having to add a separate processing stage for brackets and functions as well where I would split the string into 3 parts, whatever was before the function, after the function and the function parameter. I remember to avoid some bug I had to pre-process the string to insert a "bracket" function wherever there were brackets in the expression. I ran out of time before I could get it working properly, but still got an A. From memory, the thing that wasn't working was if there were multiple division operators they were evaluated in the wrong order.

    @yyattt@yyattt8 ай бұрын
  • This video is very useful for the compiler design course.

    @Apple-vm5gc@Apple-vm5gc9 ай бұрын
  • This is explained really well! I really like the part about the recursion and coloring :D Really makes this click for me!

    @gameofpj3286@gameofpj32868 ай бұрын
  • Nice! I think this approach (pratt parser) is the best way for building up towards writing a conventional programming language, but if you wanted to make a stack based language like dc or forth, or a s-expression lang like lisp, you could get a similar level of usability with a much simpler parser (don't need operator precedence in those languages). A neat trick for implementing operator precedence if you need it that i saw on wikipedia is to add brackets around the operators based on their precedence level, e.g. + -> ))+(( , * -> )*( , (-> (((, )-> ))) and surround the whole expression in (()). Apparently this was done in early fortran compilers.

    @Isaac-zy5do@Isaac-zy5do9 ай бұрын
    • You should look up "reverse polish"

      @palpytine@palpytine9 ай бұрын
  • Chidi Williams has neat writeups on parsing where he uses recursive descent for statements while pratt parsing is used for expressions. The code he uses is in TypeScript, but the concepts apply nicely in C as well.

    @SimGunther@SimGunther8 ай бұрын
  • I did this in rust a few months ago and it was such a fun project, recently I've been working on adding support for functions so you can use sin(), ln() and other stuff

    @fantasypvp@fantasypvp6 ай бұрын
  • I really learned a lot from this video. I have no idea that's how a calculator works, now i will look into it. ❤❤❤

    @laujimmy9282@laujimmy92828 ай бұрын
  • For a simple expression language like this, parsing straight to reverse polish notation is simpler, faster, and uses significantly less memory. The AST approach is better suited to a more complex language with multiple expressions where you're also adding a symbol table into the mix. It arguably does a better job of reporting errors in the expression.

    @palpytine@palpytine9 ай бұрын
    • That is indeed true. Infact I cut part of the video that got into parsing function calls and other things because it was just getting far too long and tedious and the concepts were already explained.

      @voxelrifts@voxelrifts9 ай бұрын
  • That sounds very interesting for working in non usual fields (rings, groups) pretty nice!!

    @vitorschroederdosanjos6539@vitorschroederdosanjos65398 ай бұрын
  • This is halfway to just straight-up being a programming language interpreter. Actually, it kind of already is, in a way.

    @NStripleseven@NStripleseven9 ай бұрын
    • Yes! Yes it is!

      @voxelrifts@voxelrifts9 ай бұрын
  • You did this so radically different to me. Who knew there were so many ways to make a calculator.

    @abdigex8255@abdigex82558 ай бұрын
  • Next you should talk about general parsing for functions and structures like you did in your compiler

    @isaacmccracken5892@isaacmccracken58929 ай бұрын
  • This video reminds me, I need to get back to work on my compiler I'm making. Need to start writing the type checker 😭😭

    @lazergenix@lazergenix8 ай бұрын
  • Thank you!

    @kasugaryuichi9767@kasugaryuichi97679 ай бұрын
  • sheesh that really makes me appreciate that 15$ casio calculator even more now

    @ahmadshami5847@ahmadshami58478 ай бұрын
  • My Simple Regex Parser was failing for a long time on unary operators. Also the fact that, they can take Nothing as an argument, made it very difficult: Such as (a|), which means that a or Nothing can come next.

    @chennebicken372@chennebicken3727 ай бұрын
  • Take a look at reverse polish notation! Transforming a token stream into this format makes evaluation trivial, as you can use a stack. Numbers will be pushed onto it and operators will take the top 2 numbers and apply the operation and push the result on the stack. The value on the stack after everything has been evaluated is the result.

    @wChris_@wChris_8 ай бұрын
    • Heh, I once made something that basically worked like that, but IMO it would have made a terrible explainer. It worked just fine (and I think that code is still in use in one of my old programs somewhere), but ultimately, the "convert to RPN" and "evaluate RPN" steps are very similar to a simplified version of "convert to AST" and "evaluate AST", with the tree being represented by two stacks (one for numbers, one for lower-precedence operators - in effect, the operator stack is for converting to RPN and the number stack is for evaluating). But this way is far far better at explaining how you would really want to think about this.

      @rosuav@rosuav6 ай бұрын
  • wow, awesome.

    @yash1152@yash11528 ай бұрын
  • Nice.

    @philtoa334@philtoa3349 ай бұрын
  • shunting yard

    @VioletJewel1729@VioletJewel17298 ай бұрын
  • Reverse hungarian notation and shunting yard? Interesting stuff, we used a stsck and did this in c++ ... gets quite nested and lots of if statements When i had white space i said +0 ...

    @AmCanTech@AmCanTech9 ай бұрын
  • 💙

    @yourfutureself4327@yourfutureself43278 ай бұрын
  • this video understand me

    @lilyzheng2322@lilyzheng23228 ай бұрын
  • When i was young... I wrote this from scratch without knowing any proper algorithms... it kinda worked by i stuck on minus followed by unary minus...

    @AK-vx4dy@AK-vx4dy8 ай бұрын
  • 5:19 f-bomb. i don't care at all; but i was kinda shocked. calm and informative video, and then in your face. unexpected. i can imagine some kid watching this video in front of strict parents etc and boom. now THAT kid's in trouble, because they clicked on YOUR video. (the reason i'm even commenting about this is because this happened to me many times, i've been that kid.) ranting aside, great video, +1 thumbs up, you've earned a sub.

    @9remi@9remi8 ай бұрын
    • That's fair xD, i doubt people younger than like... 16 would watch this video, but I will try to see whether I can in place edit it out :P

      @voxelrifts@voxelrifts8 ай бұрын
    • ​@@voxelriftsOh no you actually cut it :(

      @ciso@ciso8 ай бұрын
    • @@ciso xD yeah I did, better safe than sorry

      @voxelrifts@voxelrifts8 ай бұрын
  • First to grab this information 🎉

    @codealonewithGod@codealonewithGod9 ай бұрын
  • I asked chatgpt for an medium exercise and he gave me this lol, im struggling with operator precedence, where did you find all the information needed to solve this problem?

    @Xdetonando@Xdetonando7 ай бұрын
    • Combination of craftinginterpreters.com, Doing my own projects like a compiler and a math graphing game and also a bunch of tweaking and exploring.

      @voxelrifts@voxelrifts7 ай бұрын
  • I did my best to replicate the program however the expression always gets evaluated to 0, more precisely it doesn't know what to do when it reaches the end of file token, how are you supposed to handle that?

    @titaniumtomato7247@titaniumtomato72478 ай бұрын
    • There isn't anything special you need to do to handle eof (only make sure it is mapped to 0 precedence automatically or manually). When the parser reaches end of file, as long as there is no error in the expression, it will be looking for an operator there. So the parser can look into the precedences array/map with the eof tokentype which should return 0. Since this precedence is lower than every other precedence till that moment, the parser will just return out of all parse expression calls and return you an ast of the entire expression

      @voxelrifts@voxelrifts8 ай бұрын
  • 2:29 Why didn't you just make TokenType the enumerated type? E.g. enum TokenType { TT_EOF, TT_Error, TT_Ident .... }; You're defining the enumerated constants in an anonymous enumerated type, and then using them as numbers in a plain integer, losing the type checking. It's like you're missing the point of having an enumerated type, and you just ported code that had a bunch of #define's.

    @JohnDlugosz@JohnDlugosz9 ай бұрын
    • True, but that's really just the style I use for consistency. I do a lot of flags with enums so I use a typedef to typedef the name to an u32 then use an anonymous enum. It doesn't defeat the point of using the enum though, even if typechecking is lost there, the enum is still auto setting values with an increment. It also makes it possible to control the enum's base type. Also since this Token type enum is something that is being set on a token *once*, losing typechecking there is hardly a problem

      @voxelrifts@voxelrifts9 ай бұрын
  • how do you animate these videos with code?

    @Dg47PRO@Dg47PRO5 ай бұрын
    • I use motion canvas

      @voxelrifts@voxelrifts5 ай бұрын
  • Professionally you'd rarely code up your parsing from scratch, as there are quite a few standard tools well suited for a wide range of types of grammars. Anyhow this is a nice intro video for anyone unfamiliar with the topic.

    @HoSza1@HoSza19 ай бұрын
    • Yes, but I imagine the choice was made not to use one of those here because this is an educational video.

      @NStripleseven@NStripleseven9 ай бұрын
    • @@NStripleseven That's perfectly fine, and my addition here is to add a "where to go from here" section, which is customary in this genre.

      @HoSza1@HoSza18 ай бұрын
  • If you feel the need to explain what your abbreviated variable "ident" means. It probably shouldn't be abbreviated in the first place.

    @beanieteamie7435@beanieteamie74358 ай бұрын
    • You're not wrong, but ident is quite a standard abbreviation which many compilers use, which is why I explained what it meant

      @voxelrifts@voxelrifts8 ай бұрын
    • @@voxelrifts That's completely fair. Also, I'd like to say that I really enjoyed your video! I hope my previous comment didn't come off as hostile or anything 😅

      @beanieteamie7435@beanieteamie74358 ай бұрын
  • can you make a tutorial? would be nice to have a more detailed explanation step by step I mean

    @unLinuxeroMas@unLinuxeroMas7 ай бұрын
    • Is this not a tutorial? Definitely not super handholdy, but I do think I covered everything

      @voxelrifts@voxelrifts7 ай бұрын
    • @@voxelrifts thanks for the answer dude, and first of all is a incredible video maybe I am too newbie in programming to understand it to the fullest.I was referring to a full tutorial showing coding in real time

      @unLinuxeroMas@unLinuxeroMas6 ай бұрын
    • @@unLinuxeroMas Oh I don't plan on doing that sort of thing sorry. I don't think that helps people understand topics at all so I actively try to avoid showing code being typed line-by-line.

      @voxelrifts@voxelrifts6 ай бұрын
    • @@voxelrifts that is actually true , thanks for the answer again, I was asking for that kind of tutorial becose I want to see how is the program structured in the files , and what is that code that you put in a part of the screen when you are explaining the code? is that an separated file ? another function that you coded?, kinda dummy question but engish is not my first language that is why I may loose a bit of information that is said in the video XD

      @unLinuxeroMas@unLinuxeroMas6 ай бұрын
    • @@unLinuxeroMas that's understandable. I've put a github link in the description if you want to see how the thing is structured

      @voxelrifts@voxelrifts6 ай бұрын
  • The way your parser works looks a lot like the way Jon Blow said in this video: kzhead.info/sun/gNKcpKmPaKGCYH0/bejne.html Is that where you learned about the idea. Have you tried any other techniques, and what are your thoughts on those?

    @thanhlongtran9163@thanhlongtran91639 ай бұрын
    • No this is not where I learned about recursive descent parsing. I actually learned about it through crafting interpreters. The problem with that code is that it's encoded In a dispatch table which makes it hard to intuit what goes on. So what I did was unrolled that into switches and tried to figure out some cool things from it, like how the callstack mimics the expression, etc

      @voxelrifts@voxelrifts9 ай бұрын
    • @@voxelrifts -I was talking about the way your expression parser generates the correct tree order (recursive descent parsing is irrelevant)- So turns out this parser is also called recursive descent parser. That is what I think makes expression parsing different from other types of parsing. There are other algorithms like "shunting yard" or fixing up your tree order before/after returning. Have you tried any other techniques?

      @thanhlongtran9163@thanhlongtran91639 ай бұрын
    • ​@@thanhlongtran9163 I've heard of shunting yard but never implemented it myself yet. I did do the "fix up order after generating an improper tree" technique, and I much prefer this method honestly, but it's subjective. This method is nice because it's pretty much easily "embeddable" within a programming language parser.

      @voxelrifts@voxelrifts9 ай бұрын
    • @@voxelrifts What do you mean by "embeddable"? Is this the best method that you have tried so far? Also, unrelated, but I think this method is sometimes called Pratt parser, Precedence climbing, or Top-down parser.

      @thanhlongtran9163@thanhlongtran91639 ай бұрын
    • @@thanhlongtran9163 by embeddable I mean it can easily made part of a full programming language parser. Also yes this is called a Pratt parser (I mentioned the name in the video). This is definitely the most intuitive technique I've found till now though.

      @voxelrifts@voxelrifts9 ай бұрын
  • Promo-SM

    @jermainex364@jermainex3649 ай бұрын
  • why did you submit this to SoME?? this isn't mathematics.

    @schweinmachtbree1013@schweinmachtbree10137 ай бұрын
    • All subjects allied to Mathematics are allowed as declared on the website (Especially since the topic is math related). I also specifically asked on the discord to make sure it is valid :)

      @voxelrifts@voxelrifts7 ай бұрын
  • Really good video, however yoy may want to find a different way to phrase the thing at 5:19 to be more family friendly if you want the most people to see this.

    @tylerduncan5908@tylerduncan59089 ай бұрын
    • There's only one F bomb so it's PG13. I don't think people younger than 13 would be interested in the video anyway 😂

      @azmah1999@azmah19999 ай бұрын
  • harmful

    @neuvx@neuvx8 ай бұрын
KZhead