<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet href="/rss.xsl" type="text/xsl"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>GeekCoding101 - Make your way to geek</title><description>GeekCoding101: Explore AI tools, LLMs, and machine learning with expert tutorials, insights, and resources to boost your coding skills and stay ahead in tech.</description><link>https://geekcoding101.com</link><item><title>Git Notes</title><link>https://geekcoding101.com/posts/git-notes</link><guid isPermaLink="true">https://geekcoding101.com/posts/git-notes</guid><pubDate>Sun, 07 Jan 2018 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Hi there!&lt;/p&gt;
&lt;p&gt;Recently, I&apos;ve spent some time organizing my notes on git commands. I know you can find these commands online easily, but I&apos;d like to share the ones I find useful and keep them together here for my own reference. Let&apos;s take a look!&lt;/p&gt;
&lt;h1&gt;General Settings&lt;/h1&gt;
&lt;ol&gt;
&lt;li&gt;Setting the editor for git. This is useful when writing &lt;strong&gt;Commit Messages&lt;/strong&gt;. When you commit without specifying a message inline (i.e. without &lt;code&gt;git commit -m &quot;message&quot;&lt;/code&gt;), git opens the default editor so you can write the commit message. Without a predefined editor, git may not know which text editor to open, especially if you have your own preference for editing messages.&lt;br /&gt;
In my case, I like vim, so here is how I set it. Simple.&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;git config --global core.editor &quot;vim&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;ol start=&quot;2&quot;&gt;
&lt;li&gt;Setting Committer Name &amp;amp; Email Globally. Of course, this is a must.&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;git config --global user.name &quot;your name&quot;
git config --global user.email &quot;your_email@email.com&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;ol start=&quot;3&quot;&gt;
&lt;li&gt;
&lt;p&gt;If you want to set the committer name &amp;amp; email per repository, simply omit the &lt;code&gt;--global&lt;/code&gt; flag from the commands above and run them inside your repository folder.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Changing the Author Information Just for the Next Commit&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;git commit --author=&quot;your name &amp;lt;your_email@email.com&amp;gt;&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;ol start=&quot;5&quot;&gt;
&lt;li&gt;Show all committers/authors in the log&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;git log --pretty=&quot;%an %ae%n%cn %ce&quot; | sort | uniq
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;%an author name
%ae author email
%n  new line
%cn committer name
%ce committer email
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You will notice that each commit has both an author name and a committer name.&lt;/p&gt;
&lt;h1&gt;Branches&lt;/h1&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operations&lt;/th&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Get current working branch&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git branch&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Clone a specific branch only&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git clone -b specific_branch --single-branch http://username@192.168.99.100:8080/scm/your-repo.git&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Create a branch&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git checkout -b new_branch&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Push to remote branch&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git push origin remote_branch&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Delete local branch&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git branch -d &amp;lt;branch&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Delete remote branch&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git push origin :&amp;lt;branch&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Notes:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&quot;git branch &amp;lt;branch_name&amp;gt;&quot; : The repository history remains unchanged. All you get is a new pointer to the current commit.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The &quot;git branch &amp;lt;branch_name&amp;gt;&quot; only creates the new branch. To start adding commits to it, you need to select it with git checkout, and then use the standard git add and git commit commands.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Deleting branches: &quot;-d&quot; is a &quot;safe&quot; operation, in that git prevents you from deleting the branch if it has unmerged changes. For the remote delete, the &quot;:&quot; in front of the branch name is what says &quot;delete&quot;; you can also remove a branch through the GitHub interface: &lt;a href=&quot;https://help.github.com/articles/deleting-unused-branches&quot;&gt;https://help.github.com/articles/deleting-unused-branches&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
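The safety note above can be seen in action. Below is a throwaway sketch (all repo paths and branch names are examples) showing that -d refuses to delete a branch with unmerged commits, while -D force-deletes it:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "You"
git commit -q --allow-empty -m "initial"
main=$(git symbolic-ref --short HEAD)   # default branch name varies by git version
git checkout -q -b feature
git commit -q --allow-empty -m "unmerged work"
git checkout -q "$main"
if git branch -d feature 2>/dev/null; then
  echo "unexpected: -d deleted an unmerged branch"
else
  echo "-d refused: feature has unmerged commits"
fi
git branch -D feature   # force delete, discarding the unmerged commit
```

Everything happens inside the mktemp scratch folder, so nothing else is touched.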
&lt;h1&gt;Fast-forward merge&lt;/h1&gt;
&lt;ol&gt;
&lt;li&gt;Start a new feature&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;git checkout -b new-feature master
&lt;/code&gt;&lt;/pre&gt;
&lt;ol start=&quot;2&quot;&gt;
&lt;li&gt;Edit some files&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;git add &amp;lt;file&amp;gt;
git commit -m &quot;Start a feature&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;ol start=&quot;3&quot;&gt;
&lt;li&gt;Edit some more files&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;git add &amp;lt;file&amp;gt;
git commit -m &quot;Finish a feature&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;ol start=&quot;4&quot;&gt;
&lt;li&gt;Merge in the new-feature branch&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;git checkout master   (switch to the master branch)
git merge new-feature (merge your changes from new-feature into master)
git branch -d new-feature
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Git Clone&lt;/h1&gt;
&lt;h2&gt;Clone into current directory&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;git init .
git remote add origin &amp;lt;repository-url&amp;gt;
git pull origin master
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Git Checkout&lt;/h1&gt;
&lt;h2&gt;Check out files deleted locally&lt;/h2&gt;
&lt;p&gt;Sometimes you might accidentally delete some files in your local repository. You can then use the command below to restore them from the last commit (HEAD):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git checkout HEAD &amp;lt;path&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
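A throwaway demonstration of that recovery (the file name is an example); on git 2.23 and newer, `git restore --source=HEAD -- path` does the same thing:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "You"
echo "hello" > notes.txt
git add notes.txt
git commit -qm "add notes"
rm notes.txt                     # the accidental local deletion
git checkout HEAD -- notes.txt   # bring it back from the last commit
cat notes.txt
```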
&lt;h2&gt;Clone a subdirectory only of a Git repository&lt;/h2&gt;
&lt;p&gt;What you are trying to do is called a &lt;strong&gt;sparse checkout&lt;/strong&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;mkdir &amp;lt;repo&amp;gt;
cd &amp;lt;repo&amp;gt;
git init
git remote add -f origin &amp;lt;https://product.git&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This creates an empty repository with your remote, and fetches all objects but doesn&apos;t check them out. Then do:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git config core.sparseCheckout true
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now you need to define which files/folders you actually want to check out. This is done by listing them in &lt;code&gt;.git/info/sparse-checkout&lt;/code&gt;, e.g.:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;echo &quot;temp&quot; &amp;gt;&amp;gt; .git/info/sparse-checkout
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Last but not least, update your empty repo with the state from the remote:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git pull origin master
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You will now have the files under the &quot;temp&quot; folder checked out on your file system, with no other paths present.&lt;/p&gt;
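The sparse-checkout steps above can be exercised end-to-end against a throwaway local repository standing in for the remote (all paths and folder names below are examples):

```shell
set -e
work=$(mktemp -d)
# build a throwaway "remote" with two top-level folders
src="$work/origin"
git init -q "$src"
cd "$src"
git config user.email "you@example.com"
git config user.name "You"
mkdir temp other
echo a > temp/a.txt
echo b > other/b.txt
git add .
git commit -qm "initial"
branch=$(git symbolic-ref --short HEAD)   # default branch name varies

# sparse checkout of only temp/
dst="$work/sparse"
mkdir "$dst"
cd "$dst"
git init -q
git config user.email "you@example.com"
git config user.name "You"
git remote add -f origin "$src"
git config core.sparseCheckout true
echo "temp" >> .git/info/sparse-checkout
git pull -q origin "$branch"
ls    # only temp/ is present
```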
&lt;h2&gt;Clone specific branch of a Git repository&lt;/h2&gt;
&lt;p&gt;Just use the &lt;strong&gt;--single-branch&lt;/strong&gt; option:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git clone -b &amp;lt;branch or tag&amp;gt; --single-branch https://user.name@product.git
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Git Remote&lt;/h1&gt;
&lt;p&gt;I was wondering what a &lt;code&gt;git remote&lt;/code&gt; actually is. Here is one explanation:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A remote in git is basically a bookmark for a different repository from which you may wish to pull or push code. The bookmarked repository may be on your local computer in a different folder, on a remote server, or it may even be the repository itself (I haven&apos;t tried this), but the simplest analogy is a bookmark. The repository doesn&apos;t even have to be a version of your repository; it may be a completely unrelated one.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Other explanations:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;As you probably know, git is a distributed version control system. Most operations are done locally. To communicate with the outside world, git uses what are called remotes. These are repositories other than the one on your local disk which you can push your changes into (so that other people can see them) or pull from (so that you can get others&apos; changes). The command &lt;code&gt;git remote add origin git@github.com:peter/first_app.git&lt;/code&gt; creates a new remote called origin located at git@github.com:peter/first_app.git. Once you do this, in your push commands, you can push to origin instead of typing out the whole URL. Is the word &apos;origin&apos; arbitrary? Yes.&lt;/p&gt;
&lt;/blockquote&gt;
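As a sketch of the bookmark idea, remotes can be listed, added, and renamed freely. The URL below is just a placeholder path, and renaming origin shows the name is only a convention:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
# the URL is never contacted until you fetch/push; it is only a bookmark
git remote add origin /tmp/some/other/repo.git
git remote -v
git remote rename origin upstream   # "origin" is just the conventional name
git remote
```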
&lt;h1&gt;Git Log&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;git log -1 --graph   --name-only  feature/your-feature
* commit d1f669674305665f9e6b8914511ed709aa8f09xb (HEAD -&amp;gt; feature/your-feature, origin/feature/your-feature)
| Author: xx.author &amp;lt;xx.author@yourdomain.com&amp;gt;
| Date:   Wed Jul 26 15:14:48 2017 -0700
|
|     Your comments.
|
| your-source-code/file.sh
| your-source-code/file02.sh

git log -1 --graph   --name-only --pretty=oneline feature/your-feature
* d1f669674305665f9e6b8914511ed709aa8f0x2b (HEAD -&amp;gt; feature/your-feature, origin/feature/your-feature) Your comments.
| your-source-code/file.sh
| your-source-code/file02.sh
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The log command takes a &lt;code&gt;--follow&lt;/code&gt; argument that continues history from before a rename operation:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git log --follow ./renamed_path/to/file
&lt;/code&gt;&lt;/pre&gt;
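A throwaway demonstration (file names are examples): without --follow the log for the new path stops at the rename commit; with --follow it continues back to the original creation:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "You"
echo x > old_name.txt
git add old_name.txt
git commit -qm "create file"
git mv old_name.txt new_name.txt
git commit -qm "rename file"
git log --oneline -- new_name.txt            # only the rename commit
git log --follow --oneline -- new_name.txt   # full history across the rename
```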
&lt;h1&gt;Git Diff&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;git diff HEAD your_file
git diff HEAD@{1} your_file    --&amp;gt; The @{1} means &quot;the previous position of the ref I&apos;ve specified&quot;, so that evaluates to what you had checked out previously - just before the pull.
git diff HEAD^                 --&amp;gt; This will diff all files which have been changed in previous commit.
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Diff that changed between two commits&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;git diff --word-diff SHA1 SHA2
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you only need the diff of the last commit against the previous one:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git show
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you only need the file names and the commit message:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git show --name-only
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Git Stash&lt;/h1&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operations&lt;/th&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Stash&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git stash&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bring back your local changes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git stash pop&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;code&gt;git stash pop&lt;/code&gt; throws away the (topmost, by default) stash after applying it, whereas &lt;code&gt;git stash apply&lt;/code&gt; leaves it in the stash list for possible later reuse (or you can then git stash drop it).&lt;/p&gt;
&lt;p&gt;The exception is when there are conflicts after &lt;code&gt;git stash pop&lt;/code&gt;: in that case it will not remove the stash, behaving exactly like &lt;code&gt;git stash apply&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Another way to look at it: &lt;code&gt;git stash pop&lt;/code&gt; is &lt;code&gt;git stash apply&lt;/code&gt; &amp;amp;&amp;amp; &lt;code&gt;git stash drop&lt;/code&gt;.&lt;/p&gt;
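That difference can be sketched in a scratch repository (file names are examples): apply leaves the entry in the stash list, pop removes it:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "You"
echo one > f.txt
git add f.txt
git commit -qm "init"
echo two >> f.txt          # a local change to stash
git stash push -q
git stash apply -q         # change is back, entry is kept
git stash list             # still shows one entry
git checkout -q -- f.txt   # throw the applied change away again
git stash pop -q           # applies AND drops the entry
git stash list             # now empty
```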
&lt;h1&gt;Git Squash&lt;/h1&gt;
&lt;h2&gt;Why do we need git squash, and how does it help?&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://softwareengineering.stackexchange.com/questions/263164/why-squash-git-commits-for-pull-requests&quot;&gt;https://softwareengineering.stackexchange.com/questions/263164/why-squash-git-commits-for-pull-requests&lt;/a&gt; has collected a lot of great explanations. Among those answers, the point I favor most is this one:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Because often the person pulling a PR cares about the net effect of the commits &quot;added feature X&quot;, not about the &quot;base templates, bugfix function X, add function Y, fixed typos in comments, adjusted data scaling parameters, hashmap performs better than list&quot;... level of detail&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Steps:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Set your current branch: &lt;code&gt;export curr_branch=&quot;&amp;lt;your_current_branch&amp;gt;&quot;&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;git log --oneline origin/master..$curr_branch&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;git reset --soft `git merge-base origin/master $curr_branch`&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;git commit -c &amp;lt;hash string of one of your previous msg&amp;gt;&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;git diff HEAD &amp;lt;your_file&amp;gt;&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;git push --force&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
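The steps above can be exercised end-to-end with a throwaway local origin. Branch and file names are examples, and whatever the default branch is called here stands in for the post's origin/master:

```shell
set -e
work=$(mktemp -d)
# a throwaway "origin" plus a clone of it, so origin/... refs exist locally
git init -q "$work/origin"
cd "$work/origin"
git config user.email "you@example.com"
git config user.name "You"
git commit -q --allow-empty -m "base"
main=$(git symbolic-ref --short HEAD)    # default branch name varies
cd "$work"
git clone -q "$work/origin" clone
cd clone
git config user.email "you@example.com"
git config user.name "You"
git checkout -q -b feature
for i in 1 2 3; do
  echo "$i" > "file$i.txt"
  git add "file$i.txt"
  git commit -qm "wip $i"
done
git log --oneline "origin/$main..feature"                   # three wip commits
git reset --soft "$(git merge-base "origin/$main" feature)"
git commit -qm "added feature X"                            # one squashed commit
git log --oneline "origin/$main..feature"
```

After the soft reset, all the work is still staged, so a single commit captures the net effect of the three wip commits.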
&lt;h1&gt;Revert-Reset&lt;/h1&gt;
&lt;h2&gt;Undo a commit and Redo&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;$ git commit -m &quot;Something comments&quot;
$ git reset HEAD~
&amp;lt;&amp;lt; edit files as necessary &amp;gt;&amp;gt;
$ git add ...
$ git commit -c ORIG_HEAD
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For the last command, if you do not need to edit the message, you can use the &lt;code&gt;-C&lt;/code&gt; option instead.&lt;/p&gt;
&lt;h2&gt;Revert a commit already pushed to a remote repository&lt;/h2&gt;
&lt;h3&gt;Revert with log history for tracing the rollback operation&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;$ git revert &amp;lt;commit hash&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It creates a new commit that undoes a previous commit (e.g. ab12cd15); the change is removed from the local and remote branch, but you keep a trace of the rollback in the log.&lt;/p&gt;
&lt;h3&gt;Revert without leaving any log trace of the rollback&lt;/h3&gt;
&lt;p&gt;You just committed a change to your local branch and immediately pushed it to the remote branch. Then you suddenly realize: oh no, I don&apos;t need this change! Now what can you do?&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git reset --hard HEAD~1        (delete that commit from the local branch)
git push origin HEAD --force   (force the remote branch to match)
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Revise commit log&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;git commit --amend
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Ignore local files instead of updating .gitignore&lt;/h2&gt;
&lt;p&gt;Source: &lt;a href=&quot;https://practicalgit.com/blog/make-git-ignore-local-changes-to-tracked-files.html&quot;&gt;https://practicalgit.com/blog/make-git-ignore-local-changes-to-tracked-files.html&lt;/a&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git update-index --assume-unchanged &amp;lt;file-to-ignore&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now you can do whatever you want in that file and it will not show up as changed in git.&lt;/p&gt;
&lt;p&gt;This works unless the file is changed on the remote branch; in that case, a pull will fail with an error.&lt;/p&gt;
&lt;p&gt;When that happens you need to tell Git to start caring about the file again, stash it, pull, apply your stashed changes, and tell Git to start ignoring the file again:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# tell Git to stop ignoring this file
$ git update-index --no-assume-unchanged &amp;lt;file-to-ignore&amp;gt;

# stash your local changes to the file
$ git stash push -- &amp;lt;file-to-ignore&amp;gt;

# Pull from remote
$ git pull

# Apply your stashed changes and resolve the possible conflict
$ git stash apply

# Now tell Git to ignore this file again
$ git update-index --assume-unchanged &amp;lt;file-to-ignore&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Pull latest changes from another branch to current branch&lt;/h2&gt;
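A minimal sketch for this section (branch names are examples): to bring the latest commits from another branch into the branch you are on, merge it; for a remote branch, git pull origin that-branch does the fetch and merge in one step:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "You"
git commit -q --allow-empty -m "base"
main=$(git symbolic-ref --short HEAD)
git checkout -q -b other-feature
echo change > new_work.txt
git add new_work.txt
git commit -qm "work on other-feature"
git checkout -q "$main"
git merge -q other-feature   # for a remote branch: git pull origin other-feature
ls                           # new_work.txt is now on the current branch
```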
&lt;h1&gt;My Git Commands Alias&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;alias gs=&quot;git status &quot;
alias ga=&quot;git add &quot;
alias gb=&quot;git branch &quot;
alias gba=&quot;git branch -a &quot;
alias gbd=&quot;git branch -d &quot;
alias gbr=&quot;git branch -r &quot;
alias gc=&quot;git commit &quot;
alias gd=&quot;git diff &quot;
alias gco=&quot;git checkout &quot;
alias glg=&quot;git log --graph --name-only &quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Q/A&lt;/h1&gt;
&lt;h2&gt;Why does git keep asking for credentials?&lt;/h2&gt;
&lt;p&gt;You just need to store the credentials with a credential helper, like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git config user.name &quot;geekcoding&quot;
git config user.email &quot;geekcoding@users.noreply.github.com&quot;
git pull  &amp;lt;- It might ask for credentials at this moment
git config --global credential.helper store   (saves credentials on disk)
git config --global credential.helper cache   (or: keeps them in memory; pick one)

&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Seeing &lt;code&gt;HEAD detached at xxxxxx&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Warning:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The command below will discard all the changes made in the detached state, so be careful when using it.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git checkout -f master
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Okay, that&apos;s all I have so far. Enjoy!&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Docker Notes</title><link>https://geekcoding101.com/posts/docker-notes</link><guid isPermaLink="true">https://geekcoding101.com/posts/docker-notes</guid><pubDate>Fri, 09 Mar 2018 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Hi there!&lt;/p&gt;
&lt;p&gt;This is yet another note from me ^^&lt;/p&gt;
&lt;p&gt;This one is for my notes about Docker. I&apos;ve been dealing with container technologies for years, and it&apos;s a good habit to dump all of my notes here.&lt;/p&gt;
&lt;p&gt;I hope you find this useful as well.&lt;/p&gt;
&lt;h1&gt;Build Docker Image&lt;/h1&gt;
&lt;h2&gt;Method 1: Docker build&lt;/h2&gt;
&lt;p&gt;Using a Dockerfile is the standard way to build a docker image.&lt;/p&gt;
&lt;p&gt;We can define the base image to pull from, copy files into it, run configuration steps, and specify which process to start.&lt;/p&gt;
&lt;p&gt;You know I like using Django for projects, so here is a Dockerfile from Cookiecutter:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# define an alias for the specific python version used in this file.
FROM python:3.11.6-slim-bullseye as python

# Python build stage
FROM python as python-build-stage

ARG BUILD_ENVIRONMENT=local

# Install apt packages
RUN apt-get update &amp;amp;&amp;amp; apt-get install --no-install-recommends -y \
  # dependencies for building Python packages
  build-essential \
  # psycopg2 dependencies
  libpq-dev

# Requirements are installed here to ensure they will be cached.
COPY ./requirements .

# Create Python Dependency and Sub-Dependency Wheels.
RUN pip wheel --wheel-dir /usr/src/app/wheels  \
  -r ${BUILD_ENVIRONMENT}.txt

# Python &apos;run&apos; stage
FROM python as python-run-stage

ARG BUILD_ENVIRONMENT=local
ARG APP_HOME=/app

ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1
ENV BUILD_ENV ${BUILD_ENVIRONMENT}

WORKDIR ${APP_HOME}

# devcontainer dependencies and utils
RUN apt-get update &amp;amp;&amp;amp; apt-get install --no-install-recommends -y \
  sudo git bash-completion ssh vim
RUN echo &quot;alias ls=&apos;ls -G --color=auto&apos;&quot; &amp;gt;&amp;gt; ~/.bashrc
RUN echo &quot;alias ll=&apos;ls -lh --color=auto&apos;&quot; &amp;gt;&amp;gt; ~/.bashrc

# Create devcontainer user and add it to sudoers
RUN groupadd --gid 1000 dev-user \
  &amp;amp;&amp;amp; useradd --uid 1000 --gid dev-user --shell /bin/bash --create-home dev-user \
  &amp;amp;&amp;amp; echo dev-user ALL=\(root\) NOPASSWD:ALL &amp;gt; /etc/sudoers.d/dev-user \
  &amp;amp;&amp;amp; chmod 0440 /etc/sudoers.d/dev-user

# Install required system dependencies
RUN apt-get update &amp;amp;&amp;amp; apt-get install --no-install-recommends -y \
  # psycopg2 dependencies
  libpq-dev \
  # Translations dependencies
  gettext \
  # cleaning up unused files
  &amp;amp;&amp;amp; apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false \
  &amp;amp;&amp;amp; rm -rf /var/lib/apt/lists/*

# All absolute dir copies ignore workdir instruction. All relative dir copies are wrt to the workdir instruction
# copy python dependency wheels from python-build-stage
COPY --from=python-build-stage /usr/src/app/wheels  /wheels/

# use wheels to install python dependencies
RUN pip install --no-cache-dir --no-index --find-links=/wheels/ /wheels/* \
  &amp;amp;&amp;amp; rm -rf /wheels/

COPY ./compose/production/django/entrypoint /entrypoint
RUN sed -i &apos;s/\r$//g&apos; /entrypoint
RUN chmod +x /entrypoint

COPY ./compose/local/django/start /start
RUN sed -i &apos;s/\r$//g&apos; /start
RUN chmod +x /start

COPY ./compose/local/django/celery/worker/start /start-celeryworker
RUN sed -i &apos;s/\r$//g&apos; /start-celeryworker
RUN chmod +x /start-celeryworker

COPY ./compose/local/django/celery/beat/start /start-celerybeat
RUN sed -i &apos;s/\r$//g&apos; /start-celerybeat
RUN chmod +x /start-celerybeat

COPY ./compose/local/django/celery/flower/start /start-flower
RUN sed -i &apos;s/\r$//g&apos; /start-flower
RUN chmod +x /start-flower

# copy application code to WORKDIR
COPY . ${APP_HOME}

ENTRYPOINT [&quot;/entrypoint&quot;]
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Method 2: Docker commit&lt;/h2&gt;
&lt;p&gt;Another way is to use &lt;code&gt;docker commit &amp;lt;container_id&amp;gt; &amp;lt;new_image_name&amp;gt;&lt;/code&gt;; it creates a new image from the current state of a container in your local Docker storage.&lt;/p&gt;
&lt;h2&gt;Save/Load&lt;/h2&gt;
&lt;p&gt;Once we have docker images, we usually want to share them with others or transfer them to another machine; that&apos;s where &lt;code&gt;docker save&lt;/code&gt;/&lt;code&gt;docker load&lt;/code&gt; come in (note that &lt;code&gt;docker export&lt;/code&gt;/&lt;code&gt;docker import&lt;/code&gt; operate on containers instead):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;docker save &amp;lt;image_name:version&amp;gt; &amp;gt; exported_file.tar
docker load &amp;lt; exported_file.tar
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Docker Registry&lt;/h1&gt;
&lt;p&gt;&lt;strong&gt;Environment:&lt;/strong&gt; CentOS 7.2&lt;/p&gt;
&lt;h2&gt;Setup Docker repository&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;sudo tee /etc/yum.repos.d/docker.repo &amp;lt;&amp;lt;-&apos;EOF&apos;
[dockerrepo]
name=Docker Repository
baseurl=https://yum.dockerproject.org/repo/main/centos/7/
enabled=1
gpgcheck=1
gpgkey=https://yum.dockerproject.org/gpg
EOF
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Install and enable docker-registry&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;yum install docker-registry
systemctl enable docker-registry.service
service docker-registry start
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Verify docker-registry service&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Use curl to check: &lt;code&gt;curl localhost:5000&lt;/code&gt;. You should get the result: &lt;code&gt;&quot;\&quot;docker-registry server\&quot;&quot;&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;systemctl status docker-registry&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Configure storage_path&lt;/h2&gt;
&lt;p&gt;Update local storage path to your specific location in &lt;code&gt;/etc/docker-registry.yml&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;local: &amp;amp;local
     &amp;lt;&amp;lt;: *common
     storage: local
     storage_path: _env:STORAGE_PATH:/data/docker/docker-registry
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then restart: &lt;code&gt;systemctl restart docker-registry.service&lt;/code&gt;&lt;/p&gt;
&lt;h2&gt;Setup client to use the registry&lt;/h2&gt;
&lt;p&gt;Update &lt;code&gt;/etc/sysconfig/docker&lt;/code&gt; to add &lt;code&gt;--insecure-registry your_ip_or_hostname:5000&lt;/code&gt; as below:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# /etc/sysconfig/docker

# Modify these options if you want to change the way the docker daemon runs
OPTIONS=&apos;--insecure-registry your_ip_or_hostname:5000 --selinux-enabled --log-driver=journald&apos;
DOCKER_CERT_PATH=/etc/docker
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Push to the registry&lt;/h2&gt;
&lt;p&gt;In order to have an image to push to the registry, let&apos;s pull one from docker.io first: &lt;code&gt;docker pull centos&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Please write down the &lt;code&gt;IMAGE ID&lt;/code&gt; for the centos image&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If you push it to your own registry now, you will get an error like the one below:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# docker push your_ip_or_hostname:5000/ci
The push refers to a repository [your_ip_or_hostname:5000/ci]
An image does not exist locally with the tag: your_ip_or_hostname:5000/ci
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So you need a matching repo name on your private registry before pushing.&lt;/p&gt;
&lt;p&gt;To do that, tag the image with your registry&apos;s address and push again:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# docker tag the_centos_image_id_you_wrote_down your_ip_or_hostname:5000/centos
[root@geekcoding101 ~]# docker push your_ip_or_hostname:5000/centos
The push refers to a repository [your_ip_or_hostname:5000/centos]
97ca462ad9cc: Image successfully pushed
Pushing tag for rev [the_centos_image_id_you_wrote_down] on {http://your_ip_or_hostname:5000/v1/repositories/centos/tags/latest}
[root@geekcoding101 ~]#
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Docker Storage&lt;/h1&gt;
&lt;h2&gt;Where does docker store images?&lt;/h2&gt;
&lt;p&gt;Usually it is &lt;code&gt;/var/lib/docker/&lt;/code&gt;, but this varies depending on the storage driver Docker is using.&lt;/p&gt;
&lt;p&gt;You can manually set the storage driver with the &lt;code&gt;-s&lt;/code&gt; or &lt;code&gt;--storage-driver=&lt;/code&gt; option to the Docker daemon.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;/var/lib/docker/{driver-name}&lt;/code&gt; will contain the driver specific storage for contents of the images.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;/var/lib/docker/graph/&amp;lt;id&amp;gt;&lt;/code&gt; now only contains metadata about the image, in the json and layersize files.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In the case of aufs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;/var/lib/docker/aufs/diff/&amp;lt;id&amp;gt;&lt;/code&gt; has the file contents of the images.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;/var/lib/docker/repositories-aufs&lt;/code&gt; is a JSON file containing local image information. This can be viewed with the command docker images&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h1&gt;Cheat Sheet&lt;/h1&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;docker version&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;docker info&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;docker images&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;docker rmi &amp;lt;image name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;docker run -t -i centos&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;docker run -d centos /bin/sh -c &quot;while true; do echo hello world; sleep 1; done&quot;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;docker stop &amp;lt;container name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;docker inspect &amp;lt;container name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;docker tag &amp;lt;image&amp;gt; &amp;lt;new_tag&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;docker logs &amp;lt;container name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Okay, that&apos;s all from me. Thank you for reading!&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Tmux Notes</title><link>https://geekcoding101.com/posts/tmux-notes</link><guid isPermaLink="true">https://geekcoding101.com/posts/tmux-notes</guid><pubDate>Fri, 24 Jan 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Hi there!&lt;/p&gt;
&lt;p&gt;Today I&apos;d like to share my notes about tmux with you!&lt;/p&gt;
&lt;p&gt;Tmux is my favorite terminal multiplexer! Several years ago I didn&apos;t care at all about the people using it, because I thought customizing it would consume too much of my time. However, one free day I finally tested the water, and now I feel like I couldn&apos;t live without it in my coding environment!&lt;/p&gt;
&lt;p&gt;It&apos;s like Vim: the learning curve is steep, but once you&apos;re comfortable with it, you&apos;ll be addicted!&lt;/p&gt;
&lt;p&gt;No more talking, let&apos;s dive into it!&lt;/p&gt;
&lt;h1&gt;Introduction&lt;/h1&gt;
&lt;p&gt;It’s tmux, a so-called terminal multiplexer. Simply speaking, tmux acts as a window manager within your terminal and allows you to create multiple windows and panes within a single terminal window.&lt;/p&gt;
&lt;h1&gt;Pane&lt;/h1&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Shortcut&lt;/th&gt;
&lt;th&gt;Comment&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Pre %&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Splitting panes in left and right&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Pre &quot;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Splitting panes in top and bottom&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Pre &amp;lt;arrow key&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Navigating in panes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;C-d&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Close panes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Pre: swap-pane -s &amp;lt;sid&amp;gt; -t &amp;lt;tid&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Swap sid pane to tid pane&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Pre z&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Toggle full screen for the current pane&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Pre C-&amp;lt;arrow key&amp;gt;&lt;/code&gt;   &lt;code&gt;Pre ⌥-&amp;lt;arrow key&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Resize pane in direction of&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h1&gt;Windows&lt;/h1&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Shortcut&lt;/th&gt;
&lt;th&gt;Comment&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Pre c&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Create a new window&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Pre ,&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Rename current window&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Pre x&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Kill the current pane after a confirmation prompt (closing the last pane closes the window)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h1&gt;Sessions&lt;/h1&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Shortcut&lt;/th&gt;
&lt;th&gt;Comment&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Pre :new -s &amp;lt;name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Create a new session&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Pre C-c&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Create a new session&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Pre $&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Rename current session&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Pre s, then x on the session&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Delete the selected session&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h1&gt;Configuration&lt;/h1&gt;
&lt;p&gt;This is the configuration folder: &lt;code&gt;~/.tmux&lt;/code&gt;.&lt;br /&gt;
This is the configuration file: &lt;code&gt;~/.tmux.conf&lt;/code&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Check all configuration: &lt;code&gt;tmux show-options -g&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Reload conf from inside a session (at the tmux command prompt, &lt;code&gt;Pre :&lt;/code&gt;): &lt;code&gt;source-file &amp;lt;tmux.conf&amp;gt;&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Reload conf from outside a session: &lt;code&gt;tmux source-file &amp;lt;.tmux.conf&amp;gt;&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
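If you reload the conf often, you can bind a key for it. This is a common convenience binding, not a tmux default; the choice of `r` and the message text are my own:

```
# in ~/.tmux.conf: press Prefix + r to reload the config
bind r source-file ~/.tmux.conf \; display-message "~/.tmux.conf reloaded"
```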
&lt;h2&gt;Session Handling&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;tmux ls                             (Same as: tmux list-sessions)
tmux kill-server                    (Kills the tmux server and all sessions on it)
tmux attach -t 0
tmux rename-session -t 0 &amp;lt;new session name&amp;gt;
tmux new -s &amp;lt;session name&amp;gt;
tmux attach -t &amp;lt;session name&amp;gt;
tmux rename-session -t &amp;lt;old session name&amp;gt; &amp;lt;new session name&amp;gt;
tmux kill-session -t targetSession  (Kills only the specified session)
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Search&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Pre &lt;code&gt;[&lt;/code&gt; to enter &lt;code&gt;copy mode&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If you&apos;re using &lt;code&gt;vi&lt;/code&gt; key bindings ( &lt;code&gt;Ctrl-b:set-window-option -g mode-keys vi&lt;/code&gt; ), press &lt;code&gt;/&lt;/code&gt;, type the string to search for, and press &lt;code&gt;Enter&lt;/code&gt;. Press &lt;code&gt;n&lt;/code&gt; to repeat the search and &lt;code&gt;N&lt;/code&gt; to search in the opposite direction. You can also start a reverse search directly with &lt;code&gt;?&lt;/code&gt;. Press &lt;code&gt;q&lt;/code&gt; to exit &lt;code&gt;copy mode&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
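To make the `vi` key bindings permanent instead of setting them in each session, you can put the option in `~/.tmux.conf` (a small sketch; adjust to taste):

```
# in ~/.tmux.conf: always use vi keys in copy mode
set-window-option -g mode-keys vi
```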
&lt;h1&gt;Plugins&lt;/h1&gt;
&lt;p&gt;I haven&apos;t explored plugins much, but these are the ones I have used; you can give them a try:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;tmux-plugins/tpm&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;tmux-plugins/tmux-sensible&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;tmux-plugins/tmux-resurrect&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Save an entire tmux session: &lt;code&gt;prefix + Control + s&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Restore an entire tmux session: &lt;code&gt;prefix + Control + r&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h1&gt;Use a conf from GitHub&lt;/h1&gt;
&lt;p&gt;I used &lt;a href=&quot;https://github.com/gpakosz/.tmux.git&quot;&gt;https://github.com/gpakosz/.tmux.git&lt;/a&gt;. It&apos;s well customized.&lt;br /&gt;
My &lt;code&gt;.tmux.conf&lt;/code&gt; and &lt;code&gt;.tmux.conf.local&lt;/code&gt; are based on it.&lt;/p&gt;
&lt;h2&gt;Integrate with iTerm2&lt;/h2&gt;
&lt;p&gt;I have iTerm2 installed on my Mac, so I want to integrate tmux with it.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;In iTerm2, &lt;code&gt;General -&amp;gt; Command&lt;/code&gt;:&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;  tmux attach -t base || tmux new -s base
&lt;/code&gt;&lt;/pre&gt;
&lt;ol&gt;
&lt;li&gt;In tmux, &lt;code&gt;Prefix (Control + B) + w&lt;/code&gt; lists all windows. Each entry has a shortcut. You might notice that after &lt;code&gt;0~9&lt;/code&gt;, the shortcuts start with &lt;code&gt;M&lt;/code&gt;, like &lt;code&gt;M-i&lt;/code&gt;. The &lt;code&gt;M&lt;/code&gt; means the &lt;code&gt;Meta Key&lt;/code&gt;, which you need to map in iTerm2. I&apos;ve set it as below:&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src=&quot;./iTerm-map-meta-key.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;h1&gt;My customization&lt;/h1&gt;
&lt;ol&gt;
&lt;li&gt;Moving windows by adding below settings into &lt;code&gt;~/.tmux.conf.local&lt;/code&gt; (&lt;code&gt;C&lt;/code&gt; means Control, &lt;code&gt;S&lt;/code&gt; means Shift, Left/Right means arrow keys):&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;bind-key -n C-S-Left swap-window -t -1 \; select-window -t -1
bind-key -n C-S-Right swap-window -t +1 \; select-window -t +1
&lt;/code&gt;&lt;/pre&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Navigating between panes (NOT windows): &lt;code&gt;Prefix + (→ | ← | ↑ | ↓)&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Find the mouse mode setting in &lt;code&gt;~/.tmux.conf.local&lt;/code&gt; and uncomment it as below:&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;# start with mouse mode enabled
set -g mouse on
&lt;/code&gt;&lt;/pre&gt;
&lt;ol&gt;
&lt;li&gt;A script to attach or create a tmux session, which you can configure to run at iTerm2 startup. The script refuses to attach to the tmux session again if it is already attached in another iTerm2 tab. Without this guard, every new iTerm2 tab would show the exact same tmux session, which defeats the purpose of opening a new tab.&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;#!/bin/zsh

tmux ls|grep kongfu|grep -q attached

if [[ $? != 0 ]] ; then
  tmux attach -t kongfu  ||  tmux new-session -s kongfu
else
  echo &quot;********************************************************************************&quot;
  echo &quot;* Ignore attaching tmux kongfu session as it has been attached already.        *&quot;
  echo &quot;********************************************************************************&quot;
fi
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./tmux_script_iterm2.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;h1&gt;Misc&lt;/h1&gt;
&lt;h2&gt;Resolution problem in multiple monitors&lt;/h2&gt;
&lt;p&gt;Let’s say you’re connecting to a remote server over ssh with Terminal.app. If you start tmux on a smaller monitor and later &lt;code&gt;tmux attach&lt;/code&gt; from a bigger one, tmux keeps the smaller size and draws dots around the console instead of filling the new window.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; Detach the smaller client from the session. Pressing &lt;code&gt;C-b D&lt;/code&gt; lets you choose which client to detach.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Okay, that&apos;s all for us today!&lt;br /&gt;
Hope you love it!&lt;/p&gt;
&lt;/blockquote&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Build and Sign RPM package and repo</title><link>https://geekcoding101.com/posts/build-and-sign-rpm-package-and-repo</link><guid isPermaLink="true">https://geekcoding101.com/posts/build-and-sign-rpm-package-and-repo</guid><pubDate>Fri, 22 Jan 2021 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Hi there!&lt;/p&gt;
&lt;p&gt;Welcome to geekcoding101.com!&lt;/p&gt;
&lt;p&gt;I have two decades of working experience with Linux. There are many things I have come across, but building packages for Linux is something you can&apos;t avoid in your work or study!&lt;/p&gt;
&lt;p&gt;I have summarized the steps/tricks in this article, hope you will find it useful!&lt;/p&gt;
&lt;p&gt;Enjoy!&lt;/p&gt;
&lt;h1&gt;Create unsigned rpm&lt;/h1&gt;
&lt;p&gt;I will first demonstrate how to create unsigned rpm.&lt;/p&gt;
&lt;h2&gt;Create Folder Structure&lt;/h2&gt;
&lt;p&gt;First step is creating folder structure.&lt;/p&gt;
&lt;p&gt;If you don&apos;t specify &lt;code&gt;%_topdir&lt;/code&gt; in &lt;code&gt;~/.rpmmacros&lt;/code&gt; (rpm&apos;s per-user config file), rpmbuild uses &lt;code&gt;~/rpmbuild&lt;/code&gt; by default.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cd ~ 
mkdir rpmbuild 
cd rpmbuild 
mkdir BUILD BUILDROOT SOURCES SRPMS RPMS SPECS
&lt;/code&gt;&lt;/pre&gt;
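The four commands above can be collapsed into a single `mkdir -p` using shell brace expansion. The sketch below runs against a `mktemp` scratch directory so it is safe to try anywhere; for a real build, use `~/rpmbuild` instead:

```shell
# demo against a scratch directory; use ~/rpmbuild for real builds
topdir="$(mktemp -d)/rpmbuild"
mkdir -p "$topdir"/{BUILD,BUILDROOT,SOURCES,SRPMS,RPMS,SPECS}
ls "$topdir"
```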
&lt;h2&gt;Create SPEC file for unsigned rpm&lt;/h2&gt;
&lt;p&gt;Now we can work on the spec file &lt;code&gt;SPECS/rpm-no-sig.spec&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# cat SPECS/rpm-no-sig.spec
Name:       rpm-no-sig
Version:    1.0
Release:    1
Summary:    This is an unsigned rpm.

Vendor:     Geekcoding101
License:    Copyright (c) 2021

BuildArch:  noarch
BuildRoot:  %{_tmppath}/%{name}-%{version}

Packager:   Geekcoding101

Source0:    %{name}-%{version}.tar.gz
%define INSTALL_DIR /usr/lib/unsigned-rpm/
%define INSTALL_FILE rpm-helper-unsigned.py

%description
%{Summary}
This package provides rpm gpgcheck demo.

%prep
%setup -q

%install
rm -rf %{buildroot}
install --directory %{buildroot}/%{INSTALL_DIR}
install -m 0755 %{INSTALL_FILE} %{buildroot}/%{INSTALL_DIR}

%clean
rm -rf %{buildroot}

%files
%defattr(-,root,root,-)
%{INSTALL_DIR}
%{INSTALL_DIR}/%{INSTALL_FILE}
%exclude %{INSTALL_DIR}/*.pyc
%exclude %{INSTALL_DIR}/*.pyo

%doc

%changelog
* Sun Jan 21 2021 - Geekcoding101
- Initial commit.

%post
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Create a dummy source file for unsigned rpm&lt;/h2&gt;
&lt;p&gt;Use a dummy py file to be packed into the rpm: &lt;code&gt;rpm-helper-unsigned.py&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#!/usr/bin/env python

import sys
import os
import re

def main():
    print(&quot;This is from unsigned rpm.&quot;)

if __name__ == &apos;__main__&apos;:
    main()
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create a folder: &lt;code&gt;mkdir &amp;lt;rpm-name&amp;gt;-&amp;lt;version&amp;gt;&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;For example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cd ~
mkdir rpm-no-sig-1.0
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then put &lt;code&gt;rpm-helper-unsigned.py&lt;/code&gt; under it.&lt;/p&gt;
&lt;p&gt;Then make gz file for the folder:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;tar cf rpm-no-sig-1.0.tar rpm-no-sig-1.0
gzip rpm-no-sig-1.0.tar

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You will get file &lt;code&gt;rpm-no-sig-1.0.tar.gz&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Move it to &lt;code&gt;SOURCES&lt;/code&gt; folder.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;When building the rpm, rpmbuild will recognize this gz file and extract it automatically.&lt;/strong&gt;&lt;/p&gt;
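The separate `tar` and `gzip` steps can also be done in one go with `tar czf`. The sketch below uses a scratch directory and an empty placeholder file so it can be run anywhere; substitute your real source folder:

```shell
# demo in a scratch directory; substitute your real source folder
cd "$(mktemp -d)"
mkdir rpm-no-sig-1.0
touch rpm-no-sig-1.0/rpm-helper-unsigned.py   # empty placeholder for the demo
tar czf rpm-no-sig-1.0.tar.gz rpm-no-sig-1.0  # tar + gzip in a single step
tar tzf rpm-no-sig-1.0.tar.gz                 # list the archive contents
```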
&lt;h2&gt;Build rpm-no-sig.rpm&lt;/h2&gt;
&lt;p&gt;Run command: &lt;code&gt;rpmbuild -ba SPECS/rpm-no-sig.spec&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# rpmbuild -ba SPECS/rpm-no-sig.spec
warning: bogus date in %changelog: Sun Jan 21 2021 - Geekcoding101
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.ut6Q5z
+ umask 022
+ cd /root/rpmbuild/BUILD
+ cd /root/rpmbuild/BUILD
+ rm -rf rpm-no-sig-1.0
+ /usr/bin/gzip -dc /root/rpmbuild/SOURCES/rpm-no-sig-1.0.tar.gz
+ /usr/bin/tar -xf -
+ STATUS=0
+ &apos;[&apos; 0 -ne 0 &apos;]&apos;
+ cd rpm-no-sig-1.0
+ /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w .
+ exit 0
Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.UOxKuK
+ umask 022
+ cd /root/rpmbuild/BUILD
+ &apos;[&apos; /root/rpmbuild/BUILDROOT/rpm-no-sig-1.0-1.x86_64 &apos;!=&apos; / &apos;]&apos;
+ rm -rf /root/rpmbuild/BUILDROOT/rpm-no-sig-1.0-1.x86_64
++ dirname /root/rpmbuild/BUILDROOT/rpm-no-sig-1.0-1.x86_64
+ mkdir -p /root/rpmbuild/BUILDROOT
+ mkdir /root/rpmbuild/BUILDROOT/rpm-no-sig-1.0-1.x86_64
+ cd rpm-no-sig-1.0
+ rm -rf /root/rpmbuild/BUILDROOT/rpm-no-sig-1.0-1.x86_64
+ install --directory /root/rpmbuild/BUILDROOT/rpm-no-sig-1.0-1.x86_64//usr/lib/unsigned-rpm/
+ install -m 0755 rpm-helper-unsigned.py /root/rpmbuild/BUILDROOT/rpm-no-sig-1.0-1.x86_64//usr/lib/unsigned-rpm/
+ &apos;[&apos; noarch = noarch &apos;]&apos;
+ case &quot;${QA_CHECK_RPATHS:-}&quot; in
+ /usr/lib/rpm/check-buildroot
+ /usr/lib/rpm/redhat/brp-compress
+ /usr/lib/rpm/redhat/brp-strip /usr/bin/strip
+ /usr/lib/rpm/redhat/brp-strip-comment-note /usr/bin/strip /usr/bin/objdump
+ /usr/lib/rpm/redhat/brp-strip-static-archive /usr/bin/strip
+ /usr/lib/rpm/brp-python-bytecompile /usr/bin/python 1
+ /usr/lib/rpm/redhat/brp-python-hardlink
+ /usr/lib/rpm/redhat/brp-java-repack-jars
Processing files: rpm-no-sig-1.0-1.noarch
warning: File listed twice: /usr/lib/unsigned-rpm/rpm-helper-unsigned.py
Provides: rpm-no-sig = 1.0-1
Requires(rpmlib): rpmlib(CompressedFileNames) &amp;lt;= 3.0.4-1 rpmlib(FileDigests) &amp;lt;= 4.6.0-1 rpmlib(PartialHardlinkSets) &amp;lt;= 4.0.4-1 rpmlib(PayloadFilesHavePrefix) &amp;lt;= 4.0-1
Requires: /usr/bin/env
Checking for unpackaged file(s): /usr/lib/rpm/check-files /root/rpmbuild/BUILDROOT/rpm-no-sig-1.0-1.x86_64
Wrote: /root/rpmbuild/SRPMS/rpm-no-sig-1.0-1.src.rpm
Wrote: /root/rpmbuild/RPMS/noarch/rpm-no-sig-1.0-1.noarch.rpm
Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.w4KHSg
+ umask 022
+ cd /root/rpmbuild/BUILD
+ cd rpm-no-sig-1.0
+ rm -rf /root/rpmbuild/BUILDROOT/rpm-no-sig-1.0-1.x86_64
+ exit 0
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You will get &lt;code&gt;RPMS/noarch/rpm-no-sig-1.0-1.noarch.rpm&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Check the digests and signature: &lt;code&gt;rpm -Kv &amp;lt;rpm file&amp;gt;&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# rpm -Kv RPMS/noarch/rpm-no-sig-1.0-1.noarch.rpm
RPMS/noarch/rpm-no-sig-1.0-1.noarch.rpm:
    Header SHA1 digest: OK (84bd6662874a27ccd5cd3247ef7a4107c1919f54)
    MD5 digest: OK (198ab02bd5765c383c57dbe113551af0)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Back up this rpm somewhere else.&lt;/p&gt;
&lt;h1&gt;Create signed rpm&lt;/h1&gt;
&lt;h2&gt;Create SPEC file for signed rpm&lt;/h2&gt;
&lt;p&gt;The spec file &lt;code&gt;SPECS/rpm-with-sig.spec&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# cat SPECS/rpm-with-sig.spec
Name:       rpm-with-sig
Version:    1.0
Release:    1
Summary:    This is a signed rpm.

Vendor:     Geekcoding101
License:    Copyright (c) 2021

BuildArch:  noarch
BuildRoot:  %{_tmppath}/%{name}-%{version}

Packager:   Geekcoding101

Source0:    %{name}-%{version}.tar.gz
%define INSTALL_DIR /usr/lib/signed-rpm/
%define INSTALL_FILE rpm-helper-signed.py

%description
%{Summary}
This package provides rpm gpgcheck demo.

%prep
%setup -q

%install
rm -rf %{buildroot}
install --directory %{buildroot}/%{INSTALL_DIR}
install -m 0755 %{INSTALL_FILE} %{buildroot}/%{INSTALL_DIR}

%clean
rm -rf %{buildroot}

%files
%defattr(-,root,root,-)
%{INSTALL_DIR}
%{INSTALL_DIR}/%{INSTALL_FILE}
%exclude %{INSTALL_DIR}/*.pyc
%exclude %{INSTALL_DIR}/*.pyo

%doc

%changelog
* Sun Jan 21 2021 - Geekcoding101
- Initial commit.

%post
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Create a dummy source file for signed rpm&lt;/h2&gt;
&lt;p&gt;Use a dummy py file to be packed into the rpm: &lt;code&gt;rpm-helper-signed.py&lt;/code&gt;. You can reuse the one above and change the print message accordingly.&lt;/p&gt;
&lt;p&gt;Create a folder: &lt;code&gt;cd ~ &amp;amp;&amp;amp; mkdir &amp;lt;rpm-name&amp;gt;-&amp;lt;version&amp;gt;&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Example: &lt;code&gt;cd ~ &amp;amp;&amp;amp; mkdir rpm-with-sig-1.0&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Move &lt;code&gt;rpm-helper-signed.py&lt;/code&gt; into the folder.&lt;/p&gt;
&lt;p&gt;Also create the gz file with the same process.&lt;/p&gt;
&lt;p&gt;Example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;tar cf rpm-with-sig-1.0.tar rpm-with-sig-1.0
gzip rpm-with-sig-1.0.tar
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You will get file &lt;code&gt;rpm-with-sig-1.0.tar.gz&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Remove &lt;code&gt;rpm-no-sig-1.0.tar.gz&lt;/code&gt; from &lt;code&gt;SOURCES&lt;/code&gt; folder. Move &lt;code&gt;rpm-with-sig-1.0.tar.gz&lt;/code&gt; into &lt;code&gt;SOURCES&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;Generate GPG key and Build rpm-with-sig.rpm&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Generate a gpg key for signing the rpm: &lt;code&gt;gpg --gen-key&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Check the new keys on your system:&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;gpg --fingerprint
gpg --list-keys
&lt;/code&gt;&lt;/pre&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Update ~/.rpmmacros to specify which key to be used for signing: &lt;code&gt;%_gpg_name &amp;lt;secret key&apos;s last 8 digits&amp;gt;&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Build the rpm: &lt;code&gt;rpmbuild -ba SPECS/rpm-with-sig.spec&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;You will get &lt;code&gt;RPMS/noarch/rpm-with-sig-1.0-1.noarch.rpm&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Sign the rpm: &lt;code&gt;rpm --addsign RPMS/noarch/rpm-with-sig-1.0-1.noarch.rpm&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Now you can check the signature: &lt;code&gt;rpm -Kv &amp;lt;rpm file&amp;gt;&lt;/code&gt; Example:&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;# rpm -Kv ../rpm-with-sig-1.0-1.noarch.rpm
../rpm-with-sig-1.0-1.noarch.rpm:
    Header V4 RSA/SHA1 Signature, key ID 1ddb39c6: NOKEY
    Header SHA1 digest: OK (64fc89cf3eb3054e6316a77f4b22c183221ab13d)
    V4 RSA/SHA1 Signature, key ID 1ddb39c6: NOKEY
    MD5 digest: OK (fadae5f41bb0c939edbe865a974fce4c)
&lt;/code&gt;&lt;/pre&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;You might see &quot;NOKEY&quot; in the above output, because we haven&apos;t imported the public key into RPM&apos;s database yet.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Export your key: &lt;code&gt;gpg --export -a &amp;lt;last_8_dig_of_your_pub_key&amp;gt; &amp;gt; PUB_KEY_SIGNING_RPM&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Import it into RPM&apos;s database: &lt;code&gt;rpm --import PUB_KEY_SIGNING_RPM&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Now you can check the signature again: &lt;code&gt;rpm -Kv rpm-with-sig-1.0-1.noarch.rpm&lt;/code&gt; Example:&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;# rpm -Kv ../rpm-with-sig-1.0-1.noarch.rpm
../rpm-with-sig-1.0-1.noarch.rpm:
    Header V4 RSA/SHA1 Signature, key ID 1ddb39c6: OK
    Header SHA1 digest: OK (64fc89cf3eb3054e6316a77f4b22c183221ab13d)
    V4 RSA/SHA1 Signature, key ID 1ddb39c6: OK
    MD5 digest: OK (fadae5f41bb0c939edbe865a974fce4c)
&lt;/code&gt;&lt;/pre&gt;
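Putting the pieces together, a minimal `~/.rpmmacros` for this walkthrough could look like the sketch below. The key ID matches the example `rpm -Kv` output above; substitute the last 8 digits of your own key:

```
# ~/.rpmmacros (sketch; values are placeholders)
%_topdir    %(echo $HOME)/rpmbuild
%_gpg_name  1DDB39C6
```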
&lt;h1&gt;Creating repo database/conf for unsigned rpm&lt;/h1&gt;
&lt;h2&gt;Create repo database for the rpm-no-sig rpm&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;cd ~
mkdir unsigned_repo_with_rpm_no_sig
cp rpm-no-sig-1.0-1.noarch.rpm unsigned_repo_with_rpm_no_sig
createrepo --database unsigned_repo_with_rpm_no_sig/
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# createrepo --database unsigned_repo_with_rpm_no_sig/
Spawning worker 0 with 1 pkgs
Spawning worker 1 with 0 pkgs
Spawning worker 2 with 0 pkgs
Spawning worker 3 with 0 pkgs
Workers Finished
Saving Primary metadata
Saving file lists metadata
Saving other metadata
Generating sqlite DBs
Sqlite DBs complete
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Create repo conf file for unsigned repo&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;# cat /etc/yum.repos.d/rpm-no-sig.repo
[RPM-NO-SIG]
name=rpm no sig repository
baseurl=file:///root/unsigned_repo_with_rpm_no_sig
enabled=1
gpgcheck=0
localpkg_gpgcheck=0
repo_gpgcheck=0
skip_if_unavailable=1
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Creating signed repo database/conf/gpg for signed rpm&lt;/h1&gt;
&lt;h2&gt;Generate GPG key and Create repo for the rpm-with-sig rpm&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;cd ~
mkdir signed_repo_with_rpm_with_sig
cp rpm-with-sig-1.0-1.noarch.rpm signed_repo_with_rpm_with_sig
createrepo --database signed_repo_with_rpm_with_sig/
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Generate a new gpg key for signing the repo: &lt;code&gt;gpg --gen-key&lt;/code&gt;. Then create the asc file (note &lt;code&gt;-u&lt;/code&gt;/&lt;code&gt;--local-user&lt;/code&gt; selects the signing key; &lt;code&gt;-r&lt;/code&gt; is for encryption recipients):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;gpg --detach-sign --armor -u 0x&amp;lt;secret key&apos;s last 8 digits fingerprint&amp;gt; signed_repo_with_rpm_with_sig/repodata/repomd.xml
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It will generate &lt;code&gt;signed_repo_with_rpm_with_sig/repodata/repomd.xml.asc&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;Create repo conf file for signed repo&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;# cat /etc/yum.repos.d/rpm-with-sig.repo
[RPM-WITH-SIG]
name=rpm with sig repository
baseurl=file:///root/signed_repo_with_rpm_with_sig
enabled=1
gpgcheck=1
localpkg_gpgcheck=1
repo_gpgcheck=1
skip_if_unavailable=1
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;RPM/YUM relevant GPG knowledge&lt;/h1&gt;
&lt;p&gt;There are two types of GPG keyrings used on RPM-based systems:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;RPM&apos;s GPG keyring. This keyring is used for verifying signatures on RPM packages. Imported keys are stored in the RPM database as &lt;code&gt;gpg-pubkey-*&lt;/code&gt; packages and can be listed with &lt;code&gt;rpm -qa gpg-pubkey*&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;YUM&apos;s GPG keyring. This keyring is used for verifying signatures on repository metadata. There is one keyring per repository on the system. Once the repo has been scanned by &lt;code&gt;yum repolist&lt;/code&gt;, you can find the gpg folder like this: &lt;code&gt;/var/lib/yum/repos/x86_64/7/&amp;lt;your_repo_name&amp;gt;/gpgdir&lt;/code&gt;. You can use the &lt;code&gt;gpg&lt;/code&gt; command with &lt;code&gt;--homedir&lt;/code&gt; to add, list, or delete keys, as below:&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code&gt;sudo gpg --homedir /var/lib/yum/repos/x86_64/7/&amp;lt;your_repo_name&amp;gt;/gpgdir --delete-key &amp;lt;keyid&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;YUM clean up&lt;/h1&gt;
&lt;p&gt;&lt;code&gt;yum clean all&lt;/code&gt; will not remove everything. In order to do a real clean, you could try this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;yum clean all
rm -fr /var/lib/yum/repos/x86_64/7/&amp;lt;your_repo_name&amp;gt;
rm -fr /var/cache/yum/x86_64/7/&amp;lt;your_repo_name&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Yum commands references&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;yum install &amp;lt;rpm name&amp;gt;
yum install &amp;lt;path of the rpm&amp;gt;
yum clean all
yum clean metadata
yum-config-manager
yum-config-manager &amp;lt;repo id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Q/A&lt;/h1&gt;
&lt;h2&gt;Can&apos;t remove keys from RPM due to duplicate entries&lt;/h2&gt;
&lt;p&gt;You might hit a problem where there are duplicate entries in the &lt;code&gt;rpm -qa gpg-pub*&lt;/code&gt; output with the same fingerprints.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;rpm -e gpg-pubkey-xxxx&lt;/code&gt; can&apos;t remove any of them.&lt;/p&gt;
&lt;p&gt;You should use &lt;code&gt;rpm -e --all-matches gpg-pubkey-xxxx&lt;/code&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;All right!&lt;br /&gt;
That&apos;s all for my sharing today!&lt;br /&gt;
Hope you find it useful!&lt;br /&gt;
Bye!&lt;/p&gt;
&lt;/blockquote&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>A Tutorial of Angular, Karma and Jasmine</title><link>https://geekcoding101.com/posts/a-tutorial-of-angular-karma-and-jasmine</link><guid isPermaLink="true">https://geekcoding101.com/posts/a-tutorial-of-angular-karma-and-jasmine</guid><pubDate>Fri, 08 Apr 2022 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Hey!&lt;/p&gt;
&lt;p&gt;In my career, I haven&apos;t spent much time on front-end programming. However, now I have!&lt;br /&gt;
It&apos;s been a really exciting journey learning Angular/Karma/Jasmine, and I will probably spend more time on it to gain deeper insights!&lt;/p&gt;
&lt;p&gt;Today&apos;s article is my learning journey on this; I hope you will find it a useful tutorial ^^&lt;/p&gt;
&lt;h1&gt;Introductions&lt;/h1&gt;
&lt;h2&gt;Angular Testing Utilities&lt;/h2&gt;
&lt;p&gt;Angular is a TypeScript-based free and open-source web application framework led by the Angular Team at Google and by a community of individuals and corporations. Angular is a complete rewrite from the same team that built AngularJS.&lt;/p&gt;
&lt;p&gt;Angular testing utilities provide a library for creating a test environment for your application.&lt;/p&gt;
&lt;p&gt;Classes such as TestBed and ComponentFixture and helper functions such as async and fakeAsync are part of the @angular/core/testing package.&lt;/p&gt;
&lt;p&gt;Getting acquainted with these utilities is necessary if you want to write tests that reveal how your components interact with their own template, services, and other components.&lt;/p&gt;
&lt;h3&gt;Ref Links&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href=&quot;https://angular.io/guide/testing&quot;&gt;Angular Testing Guide&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Karma&lt;/h2&gt;
&lt;p&gt;Karma is a tool that lets you test your application on multiple browsers.&lt;br /&gt;
Karma has plugins for browsers like Chrome, Firefox, Safari, and many others.&lt;br /&gt;
But I prefer using a headless browser for testing.&lt;br /&gt;
A headless browser lacks a GUI, and that way, you can keep the test results inside your terminal.&lt;/p&gt;
&lt;h3&gt;Ref Links&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://karma-runner.github.io/latest/index.html&quot;&gt;Karma homepage&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://www.npmjs.com/package/karma&quot;&gt;Karma package on npmjs&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Jasmine&lt;/h2&gt;
&lt;p&gt;Jasmine is a popular behavior-driven testing framework for JavaScript. With Jasmine, you can write tests that are more expressive and straightforward.&lt;/p&gt;
&lt;p&gt;Here is an example to get started:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;it(&apos;should have a defined component&apos;, () =&amp;gt; {
        expect(component).toBeDefined();
});
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Ref Links&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Jasmine_(JavaScript_testing_framework)&quot;&gt;Wiki&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://jasmine.github.io/&quot;&gt;Official website&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h1&gt;Steps&lt;/h1&gt;
&lt;h2&gt;Environment&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;❯ nvm ls
       v12.13.1
-&amp;gt;     v16.14.0
        v17.6.0
default -&amp;gt; 16.14 (-&amp;gt; v16.14.0)
iojs -&amp;gt; N/A (default)
unstable -&amp;gt; N/A (default)
node -&amp;gt; stable (-&amp;gt; v17.6.0) (default)
stable -&amp;gt; 17.6 (-&amp;gt; v17.6.0) (default)
lts/* -&amp;gt; lts/gallium (-&amp;gt; v16.14.0)
lts/argon -&amp;gt; v4.9.1 (-&amp;gt; N/A)
lts/boron -&amp;gt; v6.17.1 (-&amp;gt; N/A)
lts/carbon -&amp;gt; v8.17.0 (-&amp;gt; N/A)
lts/dubnium -&amp;gt; v10.24.1 (-&amp;gt; N/A)
lts/erbium -&amp;gt; v12.22.10 (-&amp;gt; N/A)
lts/fermium -&amp;gt; v14.19.0 (-&amp;gt; N/A)
lts/gallium -&amp;gt; v16.14.0
❯ npm -v
8.3.1
❯ node -v
v16.14.0
❯ ng version

     _                      _                 ____ _     ___
    / \   _ __   __ _ _   _| | __ _ _ __     / ___| |   |_ _|
   / △ \ | &apos;_ \ / _` | | | | |/ _` | &apos;__|   | |   | |    | |
  / ___ \| | | | (_| | |_| | | (_| | |      | |___| |___ | |
 /_/   \_\_| |_|\__, |\__,_|_|\__,_|_|       \____|_____|___|
                |___/

Angular CLI: 13.2.6
Node: 16.14.0
Package Manager: npm 8.3.1
OS: darwin x64

Angular: 13.2.7
... animations, common, compiler, compiler-cli, core, forms
... platform-browser, platform-browser-dynamic, router

Package                         Version
---------------------------------------------------------
@angular-devkit/architect       0.1302.6
@angular-devkit/build-angular   13.2.6
@angular-devkit/core            13.2.6
@angular-devkit/schematics      13.2.6
@angular/cli                    13.2.6
@schematics/angular             13.2.6
rxjs                            7.5.5
typescript                      4.5.5
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Create a New Angular Project&lt;/h2&gt;
&lt;p&gt;The developers at Angular have made it easy for us to set up our test environment. To get started, we need to install Angular first.&lt;/p&gt;
&lt;p&gt;I prefer using the Angular-CLI. It&apos;s an all-in-one solution that takes care of creating, generating, building and testing your Angular project.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ng new Pastebin
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Answer &lt;code&gt;yes&lt;/code&gt; to &lt;code&gt;Would you like to add Angular routing?&lt;/code&gt; and &lt;code&gt;CSS&lt;/code&gt; to &lt;code&gt;Which stylesheet format would you like to use?&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Directory structure:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ ls -l
total 1568
-rw-r--r-- 1 geekcoding101  staff    1054 Apr  8 11:51 README.md
-rw-r--r-- 1 geekcoding101  staff    3051 Apr  8 11:51 angular.json
-rw-r--r-- 1 geekcoding101  staff    1425 Apr  8 11:51 karma.conf.js
drwxr-xr-x  600 geekcoding101  staff   19200 Apr  8 11:53 node_modules
-rw-r--r-- 1 geekcoding101  staff  773285 Apr  8 11:53 package-lock.json
-rw-r--r-- 1 geekcoding101  staff    1071 Apr  8 11:51 package.json
drwxr-xr-x   11 geekcoding101  staff     352 Apr  8 11:51 src
-rw-r--r-- 1 geekcoding101  staff     287 Apr  8 11:51 tsconfig.app.json
-rw-r--r-- 1 geekcoding101  staff     863 Apr  8 11:51 tsconfig.json
-rw-r--r-- 1 geekcoding101  staff     333 Apr  8 11:51 tsconfig.spec.json
❯ tree src
src
├── app
│   ├── app-routing.module.ts
│   ├── app.component.css
│   ├── app.component.html
│   ├── app.component.spec.ts
│   ├── app.component.ts
│   └── app.module.ts
├── assets
├── environments
│   ├── environment.prod.ts
│   └── environment.ts
├── favicon.ico
├── index.html
├── main.ts
├── polyfills.ts
├── styles.css
└── test.ts

3 directories, 14 files
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Launch Angular project:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./default-angular-projecdt-launched-1024x539.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Run karma:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./karma-launched-1024x307.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;You can define a headless browser in your &lt;code&gt;karma.conf.js&lt;/code&gt; as below:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;...
browsers: [&apos;Chrome&apos;,&apos;ChromeNoSandboxHeadless&apos;],

customLaunchers: {
 ChromeNoSandboxHeadless: {
    base: &apos;Chrome&apos;,
    flags: [
      &apos;--no-sandbox&apos;,
      // See https://chromium.googlesource.com/chromium/src/+/lkgr/headless/README.md
      &apos;--headless&apos;,
      &apos;--disable-gpu&apos;,
      // Without a remote debugging port, Google Chrome exits immediately.
      &apos;--remote-debugging-port=9222&apos;,
    ],
  },
},
...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can refer to &lt;a href=&quot;#toc_Cheatsheet&quot;&gt;Cheatsheet&lt;/a&gt; about how to run unit test and specify which browser to run your test.&lt;/p&gt;
&lt;h2&gt;Add Class&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;ng generate class Pastebin
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;Pastebin.ts&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;export class Pastebin {
 
    // &quot;!&quot; (definite assignment) keeps strict-mode TypeScript happy:
    // the fields are populated via Object.assign in the constructor
    id!: number;
    title!: string;
    language!: string;
    paste!: string;
 
    constructor(values: Object = {}) {
        Object.assign(this, values);
    }
 
}
 
export const Languages = [&quot;Ruby&quot;, &quot;Java&quot;, &quot;JavaScript&quot;, &quot;C&quot;, &quot;Cpp&quot;];
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;pastebin.spec.ts&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import { Pastebin } from &apos;./pastebin&apos;;

describe(&apos;Pastebin&apos;, () =&amp;gt; {
  it(&apos;should create an instance of Pastebin&apos;, () =&amp;gt; {
    expect(new Pastebin()).toBeTruthy();
  });
  it(&apos;should accept values&apos;, () =&amp;gt; {
    let pastebin = new Pastebin();
    pastebin = {
      id: 111,
      title: &quot;Hello world&quot;,
      language: &quot;Ruby&quot;,
      paste: &apos;print &quot;Hello&quot;&apos;,
    }
    expect(pastebin.id).toEqual(111);
    expect(pastebin.language).toEqual(&quot;Ruby&quot;);
    expect(pastebin.paste).toEqual(&apos;print &quot;Hello&quot;&apos;);
  });
});
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Setting Up Angular-in-Memory-Web-API&lt;/h2&gt;
&lt;p&gt;We don&apos;t have a server API for the application we are building. Therefore, we are going to simulate the server communication using a module known as &lt;a href=&quot;https://github.com/angular/in-memory-web-api&quot;&gt;InMemoryWebApiModule&lt;/a&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;npm install angular-in-memory-web-api --save
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Add Services&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;ng generate service pastebin
ng generate service in-memory-data
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;PastebinService will host the logic for sending HTTP requests to the server.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;pastebin.service.ts&lt;/code&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import { Injectable } from &apos;@angular/core&apos;;
import { Pastebin } from &apos;./pastebin&apos;;
import { HttpClient, HttpHeaders } from &apos;@angular/common/http&apos;;
// rxjs 7: Observable.toPromise() still exists (deprecated), so no extra
// import is needed; the modern alternative is lastValueFrom from &apos;rxjs&apos;

@Injectable()
export class PastebinService {
  // The project uses InMemoryWebApi to handle the Server API. 
  // Here &quot;api/pastebin&quot; simulates a Server API url 
  private pastebinUrl = &quot;api/pastebin&quot;;
  private headers = new HttpHeaders({ &apos;Content-Type&apos;: &quot;application/json&quot; });
  constructor(private http: HttpClient) { }

  // getPastebin() performs http.get() and returns a promise.
  // HttpClient already parses the JSON body, so no response.json() call is needed.
  public getPastebin(): Promise&amp;lt;any&amp;gt; {
    return this.http.get&amp;lt;Pastebin[]&amp;gt;(this.pastebinUrl)
      .toPromise()
      .catch(this.handleError);
  }

  private handleError(error: any): Promise&amp;lt;any&amp;gt; {
    console.error(&apos;An error occurred&apos;, error);
    return Promise.reject(error.message || error);
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;in-memory-data.service.ts&lt;/code&gt; will implement &lt;code&gt;InMemoryDbService&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import { InMemoryDbService } from &apos;angular-in-memory-web-api&apos;;
import { Pastebin } from &apos;./pastebin&apos;;

export class InMemoryDataService implements InMemoryDbService {
  createDb() {
    const pastebin:Pastebin[] = [
      { id: 0,  title: &quot;Hello world Ruby&quot;, language: &quot;Ruby&quot;, paste: &apos;puts &quot;Hello World&quot;&apos; },
      {id: 1, title: &quot;Hello world C&quot;, language: &quot;C&quot;, paste: &apos;printf(&quot;Hello world&quot;);&apos;},
      {id: 2, title: &quot;Hello world CPP&quot;, language: &quot;C++&quot;, paste: &apos;cout&amp;lt;&amp;lt;&quot;Hello world&quot;;&apos;},
      {id: 3, title: &quot;Hello world Javascript&quot;, language: &quot;JavaScript&quot;, paste: &apos;console.log(&quot;Hello world&quot;)&apos;}
       
    ];
    return {pastebin};
  }
}
&lt;/code&gt;&lt;/pre&gt;
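&lt;p&gt;To make the shape of this fake database concrete, here is a framework-free sketch of the same &lt;code&gt;createDb&lt;/code&gt; idea. The &lt;code&gt;Pastebin&lt;/code&gt; interface is restated inline as an assumption so the snippet stands alone (in the project it lives in &lt;code&gt;pastebin.ts&lt;/code&gt;), and the list is trimmed to two entries:&lt;/p&gt;

```typescript
// Standalone sketch: the Pastebin shape is assumed here, mirroring the tutorial's class.
interface Pastebin {
  id: number;
  title: string;
  language: string;
  paste: string;
}

// Mirrors InMemoryDataService.createDb(): the key of the returned object
// ("pastebin") becomes the collection name behind the simulated URL "api/pastebin".
function createDb(): { pastebin: Pastebin[] } {
  const pastebin: Pastebin[] = [
    { id: 0, title: 'Hello world Ruby', language: 'Ruby', paste: 'puts "Hello World"' },
    { id: 1, title: 'Hello world C', language: 'C', paste: 'printf("Hello world");' },
  ];
  return { pastebin };
}

const db = createDb();
console.log(db.pastebin.length); // 2 entries in this trimmed sketch
```

&lt;p&gt;A GET against &lt;code&gt;api/pastebin&lt;/code&gt; then simply serves the &lt;code&gt;pastebin&lt;/code&gt; array from this object.&lt;/p&gt;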
&lt;h1&gt;Update &lt;code&gt;app.module.ts&lt;/code&gt;&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;import { NgModule } from &apos;@angular/core&apos;;
import { BrowserModule } from &apos;@angular/platform-browser&apos;;
import { HttpClientModule }    from &apos;@angular/common/http&apos;;

import { AppRoutingModule } from &apos;./app-routing.module&apos;;
import { AppComponent } from &apos;./app.component&apos;;

//In memory Web api to simulate an http server
import { InMemoryWebApiModule } from &apos;angular-in-memory-web-api&apos;;
import { InMemoryDataService }  from &apos;./in-memory-data.service&apos;;

import { PastebinService } from &quot;./pastebin.service&quot;;

@NgModule({
  declarations: [
    AppComponent
  ],
  imports: [
    BrowserModule,
    HttpClientModule,
    InMemoryWebApiModule.forRoot(InMemoryDataService),
    AppRoutingModule
  ],
  providers: [PastebinService],
  bootstrap: [AppComponent]
})
export class AppModule { }

&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Cheatsheet&lt;/h1&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operations&lt;/th&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;Comments&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Create a new Angular project&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ng new &amp;lt;project_name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Generate a new class&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ng generate class &amp;lt;class name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Generate a new service&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ng generate service &amp;lt;service name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Launch Angular project&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ng serve&lt;/code&gt; or &lt;code&gt;npm start&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Launch unit test&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ng test&lt;/code&gt; or &lt;code&gt;npm test&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Launch unit test in specific browser&lt;/td&gt;
&lt;td&gt;&lt;code&gt;npm test -- --browsers ChromeNoSandboxHeadless&lt;/code&gt; or &lt;code&gt;ng test --browsers ChromeNoSandboxHeadless&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Prerequisites&lt;/strong&gt;: You need to have &lt;code&gt;ChromeNoSandboxHeadless&lt;/code&gt; defined in your &lt;code&gt;karma.conf.js&lt;/code&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Create component without specs&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ng g component --skip-tests=true &amp;lt;component name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;You can refer to &lt;a href=&quot;https://stackoverflow.com/questions/40990280/generating-component-without-spec-ts-file-in-angular-2&quot;&gt;Stackoverflow&lt;/a&gt; for more solutions.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Run specific unit test&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ng t -- --include &quot;src/**/your_file_name.component.spec.ts&quot;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Run specific unit test with relative path&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ng test -- --include &quot;relative_path_of_the_spec.ts&quot;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Starting the relative path with &lt;code&gt;./&lt;/code&gt; didn&apos;t work for me, so use a path beginning with &lt;code&gt;src/&lt;/code&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h1&gt;References&lt;/h1&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href=&quot;https://blog.logrocket.com/angular-unit-testing-tutorial-examples/&quot;&gt;Angular unit testing tutorial with examples&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Password Authentication in Node.js: A Step-by-Step Guide</title><link>https://geekcoding101.com/posts/password-authentication-in-node-js-a-step-by-step-guide</link><guid isPermaLink="true">https://geekcoding101.com/posts/password-authentication-in-node-js-a-step-by-step-guide</guid><pubDate>Sun, 23 Jul 2023 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Introduction&lt;/h1&gt;
&lt;p&gt;Password-based authentication remains one of the most common and widely used methods to verify user identity in various online systems. It involves users providing a unique combination of a username and password to gain access to their accounts. Despite its prevalence, password-based authentication comes with security challenges, as weak or compromised passwords can lead to unauthorized access and data breaches.&lt;/p&gt;
&lt;p&gt;In this blog, I will guide you through password-based authentication, from the basics up to an intermediate level, by implementing password hashing in a Node.js and TypeScript environment. By the end of this hands-on tutorial, you will have a better understanding of how password-based authentication works in your applications.&lt;/p&gt;
&lt;h1&gt;Step 1: Setting Up the Node.js and TypeScript Environment&lt;/h1&gt;
&lt;p&gt;To get started, ensure you have Node.js installed on your machine. Create a new project folder and initialize it with a package.json file.&lt;/p&gt;
&lt;p&gt;Here are the steps I ran on macOS:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;brew install npm httpie
mkdir password-auth
cd password-auth
npm init -y
npm install -g ts-node
npm install body-parser bcryptjs express --save
npm install @types/bcryptjs @types/express @types/body-parser --save
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;Setting up the programming environment is no doubt crucial, but let’s be honest, it can be a bit daunting. In my tutorials, I will try to make sure not to leave you hanging. I love providing comprehensive explanations, even for the simple tasks or commands. Let’s make this setup process a breeze together! I genuinely hope you find it helpful and that it keeps you smoothly sailing through the tutorial 🤓&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Let’s walk through the commands above.&lt;/p&gt;
&lt;p&gt;▹ 1. &lt;code&gt;brew&lt;/code&gt; is the package manager for macOS and Linux. You can find the installation guide &lt;a href=&quot;https://brew.sh/&quot;&gt;at their website&lt;/a&gt;. Here we used it to install &lt;code&gt;npm&lt;/code&gt; and &lt;code&gt;httpie&lt;/code&gt;. &lt;code&gt;npm&lt;/code&gt; is &lt;a href=&quot;https://docs.npmjs.com/about-npm&quot;&gt;the JavaScript package manager for Node.js&lt;/a&gt;. We will test the server using the &lt;code&gt;http&lt;/code&gt; command provided by &lt;code&gt;httpie&lt;/code&gt;, &lt;a href=&quot;https://httpie.io/&quot;&gt;a command-line HTTP client&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;▹ 2. Then we created our project folder &lt;code&gt;password-auth&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;▹ 3. &lt;code&gt;npm init -y&lt;/code&gt; initializes a project instantly; the &lt;code&gt;-y&lt;/code&gt; flag skips the interactive questions.&lt;/p&gt;
&lt;p&gt;▹ 4. When you want to use the commands provided by a package from your shell, use &lt;code&gt;npm&lt;/code&gt; to install it globally with &lt;code&gt;-g&lt;/code&gt;, so that its binaries end up on your PATH. In our case, we need &lt;code&gt;ts-node&lt;/code&gt; on the command line.&lt;/p&gt;
&lt;p&gt;▹ 5. &lt;code&gt;ts-node&lt;/code&gt; is a &lt;a href=&quot;https://www.typescriptlang.org/&quot;&gt;TypeScript&lt;/a&gt; execution engine for Node.js. It allows you to run your TypeScript code directly without precompiling it to JavaScript. Typically &lt;code&gt;ts-node&lt;/code&gt; transforms TypeScript to JavaScript in memory without writing it to disk. You can find more details &lt;a href=&quot;https://typestrong.org/ts-node/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;▹ 6. &lt;code&gt;express&lt;/code&gt; is a web framework for Node.js for building web applications and APIs. Building a backend from scratch in Node.js can be tedious and time consuming. With &lt;code&gt;express&lt;/code&gt;, you can save time and focus on other important tasks.&lt;/p&gt;
&lt;p&gt;▹ 7. If you don’t install &lt;code&gt;@types/bcryptjs&lt;/code&gt;, &lt;code&gt;@types/express&lt;/code&gt; and &lt;code&gt;@types/body-parser&lt;/code&gt;, you will hit the error below when running your application:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;img src=&quot;./pwd-auth-01.webp&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;npx stands for Node Package eXecute. It is simply an npm package runner. npx is installed automatically with npm version 5.2.0 and above.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;To enable TypeScript support in a Node.js backend API project, you need to set up TypeScript to compile your TypeScript code into JavaScript. Since TypeScript requires type information for each package, we need to provide it. These @types packages offer type definitions for external modules that lack them. If you’re using an external package that already ships TypeScript definitions, you won’t need to install the corresponding @types package.&lt;/p&gt;
&lt;p&gt;▹ 8. I don’t want to overcrowd this article with setup instructions, but this is the last point. You might have seen &lt;code&gt;--save&lt;/code&gt; and &lt;code&gt;--save-dev&lt;/code&gt; when using &lt;code&gt;npm&lt;/code&gt;. With &lt;code&gt;--save&lt;/code&gt;, the dependency goes into the core dependency section of package.json, the &lt;code&gt;dependencies&lt;/code&gt; section. The other puts dependencies into the &lt;code&gt;devDependencies&lt;/code&gt; section. A core dependency is any package without which the application cannot perform its intended work, e.g. express, body-parser.&lt;/p&gt;
&lt;p&gt;Now, here is the output of &lt;code&gt;npm list&lt;/code&gt; and &lt;code&gt;npm list -g&lt;/code&gt; FYI:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./pwd-auth-02.webp&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;h1&gt;Step 2: Creating the Server&lt;/h1&gt;
&lt;p&gt;Create an &lt;code&gt;app.ts&lt;/code&gt; file and set up a basic Express server with routes for user registration and login.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import express from &apos;express&apos;;
import bodyParser from &apos;body-parser&apos;;
import bcrypt from &apos;bcryptjs&apos;;

const app = express();
const PORT = 3000;

app.use(bodyParser.json());

interface User {
  id: number;
  username: string;
  password: string;
}

let users: User[] = [];

app.post(&apos;/register&apos;, async (req, res) =&amp;gt; {
  try {
    const { username, password } = req.body;
    const salt = await bcrypt.genSalt(10);
    const hashedPassword = await bcrypt.hash(password, salt);
    const newUser: User = {
      id: users.length + 1,
      username,
      password: hashedPassword,
    };
    users.push(newUser);
    res.status(201).json({ message: &apos;User registered successfully!&apos; });
  } catch (error) {
    res.status(500).json({ error: &apos;Internal server error&apos; });
  }
});

app.post(&apos;/login&apos;, async (req, res) =&amp;gt; {
  try {
  const { username, password } = req.body;
  const user = users.find((user) =&amp;gt; user.username === username);
  if (!user) {
    return res.status(404).json({ error: &apos;User not found&apos; });
  }
  const isPasswordValid = await bcrypt.compare(password, user.password);
  if (!isPasswordValid) {
    return res.status(401).json({ error: &apos;Invalid password&apos; });
  }
  res.json({ message: &apos;Login successful!&apos; });
  } catch (error) {
    res.status(500).json({ error: &apos;Internal server error&apos; });
  }
});

app.listen(PORT, () =&amp;gt; {
  console.log(`Server is running on http://localhost:${PORT}`);
});
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Step 3: Explaining the Code&lt;/h1&gt;
&lt;p&gt;In this example, we are using an in-memory array to store registered users. In a real-world scenario, you would typically use a database for this purpose.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;/register&lt;/code&gt; route handles user registration. When a user sends a POST request with their desired &lt;code&gt;username&lt;/code&gt; and &lt;code&gt;password&lt;/code&gt;, the server will hash the password using bcrypt and store the new user in the &lt;code&gt;users&lt;/code&gt; array.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;/login&lt;/code&gt; route handles user login. When a user sends a POST request with their &lt;code&gt;username&lt;/code&gt; and &lt;code&gt;password&lt;/code&gt;, the server will find the corresponding user in the &lt;code&gt;users&lt;/code&gt; array, and then use bcrypt to compare the hashed password with the provided password.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;⚠ You might have wondered why &lt;code&gt;bcrypt.compare&lt;/code&gt; above can compare the password without a salt.&lt;br /&gt;
The reason is the salt has been stored as part of the hashed password. When you hash a password using bcrypt, the resulting hash contains both the salt and the password’s cryptographic hash.&lt;br /&gt;
For example, given plain password &lt;code&gt;testpassword&lt;/code&gt;, the hashed password would be &lt;code&gt;$2a$10$34rHf5RmJx1TZmZ7FM5BYe0BPXuw1bs6rYzzqyM7IXgN/VGcQmVMu&lt;/code&gt; .&lt;br /&gt;
So in the above hashed password, there are three fields delimited by &lt;strong&gt;$&lt;/strong&gt; symbol.&lt;/p&gt;
&lt;p&gt;I) The first part &lt;strong&gt;$2a$&lt;/strong&gt; identifies the bcrypt algorithm version used. Bcrypt was designed by the OpenBSD project to hash passwords for storage in the OpenBSD password file, and hashed passwords are stored with a prefix identifying the algorithm used. Bcrypt got the prefix &lt;code&gt;$2$&lt;/code&gt;, so besides &lt;code&gt;$2a$&lt;/code&gt; there are also &lt;code&gt;$2x$&lt;/code&gt;, &lt;code&gt;$2y$&lt;/code&gt; and &lt;code&gt;$2b$&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;II) The second part &lt;code&gt;$10$&lt;/code&gt; is the cost factor (the number of salt rounds used while creating the salt string). If we do 15 rounds, the value will be &lt;code&gt;$15$&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;III) The third part begins with the first &lt;strong&gt;22&lt;/strong&gt; characters, which are the salt string. In this case it is &lt;em&gt;34rHf5RmJx1TZmZ7FM5BYe&lt;/em&gt;. The remaining 31 characters are the hashed password.&lt;/p&gt;
&lt;p&gt;In short, wikipedia gives this formula of bcrypt hashed password:&lt;br /&gt;
&lt;code&gt;$2&amp;lt;a/b/x/y&amp;gt;$[cost]$[22 character salt][31 character hash]&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In our example, the remaining string is the hashed password: &lt;strong&gt;&lt;code&gt;0BPXuw1bs6rYzzqyM7IXgN/VGcQmVMu&lt;/code&gt;&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;So basically, the stored value is the salt string plus the hashed password, which protects against rainbow table attacks.&lt;/p&gt;
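&lt;p&gt;The &lt;strong&gt;$&lt;/strong&gt;-delimited fields can be pulled apart with plain string operations. Below is a minimal sketch; the &lt;code&gt;parseBcryptHash&lt;/code&gt; helper is my own name for illustration, not part of bcryptjs:&lt;/p&gt;

```typescript
// Hypothetical helper (not part of bcryptjs): splits a bcrypt hash of the form
// $2<a/b/x/y>$[cost]$[22-char salt][31-char hash] into its fields.
function parseBcryptHash(stored: string) {
  // split('$') on "$2a$10$..." yields ['', '2a', '10', '<salt+hash>'].
  const [, version, cost, saltAndHash] = stored.split('$');
  return {
    version,                         // e.g. "2a"
    cost: parseInt(cost, 10),        // e.g. 10 rounds
    salt: saltAndHash.slice(0, 22),  // first 22 characters are the salt
    hash: saltAndHash.slice(22),     // remaining 31 characters are the hash
  };
}

const parts = parseBcryptHash('$2a$10$34rHf5RmJx1TZmZ7FM5BYe0BPXuw1bs6rYzzqyM7IXgN/VGcQmVMu');
console.log(parts.salt); // 34rHf5RmJx1TZmZ7FM5BYe
```

&lt;p&gt;This is exactly how &lt;code&gt;bcrypt.compare&lt;/code&gt; recovers the salt before re-hashing the candidate password.&lt;/p&gt;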
&lt;h1&gt;Step 4: Testing the Server&lt;/h1&gt;
&lt;p&gt;Now that the server is set up, I am going to test it using a tool called &lt;code&gt;httpie&lt;/code&gt;. Make sure you have installed it in the previous steps.&lt;/p&gt;
&lt;p&gt;First we need to launch the server from the command line (make sure you’re already in the &lt;code&gt;password-auth&lt;/code&gt; folder):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;npx ts-node ./app.ts
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./pwd-auth-03.webp&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Open another terminal and test registration with &lt;code&gt;http&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;echo &apos;{&quot;username&quot;: &quot;testuser&quot;, &quot;password&quot;: &quot;testpassword&quot;}&apos; | http POST http://localhost:3000/register
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./pwd-auth-04.webp&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;This creates a user &lt;code&gt;testuser&lt;/code&gt; on the server with password &lt;code&gt;testpassword&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Next, we will test login:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;echo &apos;{&quot;username&quot;: &quot;testuser&quot;, &quot;password&quot;: &quot;testpassword&quot;}&apos; | http POST http://localhost:3000/login
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./pwd-auth-05.webp&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;We can also test with a wrong password or a wrong username:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./pwd-auth-06-1.webp&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./pwd-auth-07.webp&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;✎ Make sure to send requests with the appropriate JSON data to the appropriate endpoints (&lt;code&gt;/register&lt;/code&gt; and &lt;code&gt;/login&lt;/code&gt;).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h1&gt;Summary&lt;/h1&gt;
&lt;p&gt;In this blog, we explored password-based authentication in Node.js and TypeScript using bcrypt for secure password hashing. By understanding the fundamentals of bcrypt and its automatic management of salts, we created a simple authentication mechanism that stores and compares hashed passwords securely.&lt;/p&gt;
&lt;p&gt;We learned how to set up a Node.js and TypeScript environment on macOS, implemented an Express server with routes for user registration and login, and utilized bcrypt to securely hash and compare passwords.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The source code of this tutorial has been uploaded to &lt;a href=&quot;https://github.com/geekcoding101/Authentication101&quot;&gt;GeekCoding101 github repo&lt;/a&gt; as well, feel free to take a look.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In the next blogs, we will continue building upon this foundation of secure authentication. We will explore more authentication methods, such as Basic Authentication, Two-Factor Authentication (2FA), and token-based authentication using JSON Web Tokens (JWT) and so on.&lt;/p&gt;
&lt;p&gt;As we proceed, feel free to ask questions and provide feedback. I am here to support your journey towards building secure and reliable authentication solutions. So, stay tuned for the upcoming blogs 🎉🎉🎉&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>A Deep Dive into HTTP Basic Authentication</title><link>https://geekcoding101.com/posts/a-deep-dive-into-http-basic-authentication</link><guid isPermaLink="true">https://geekcoding101.com/posts/a-deep-dive-into-http-basic-authentication</guid><pubDate>Sun, 01 Oct 2023 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Introduction&lt;/h1&gt;
&lt;p&gt;In this blog post, we will dive into HTTP Basic Authentication, a method rooted in the principles outlined in RFC 7617.&lt;/p&gt;
&lt;p&gt;It’s worth noting that the RFC defines the use of the “Authorization” header in HTTP requests to transmit the credentials. The credentials are typically sent as a Base64-encoded string of the form &lt;code&gt;username:password&lt;/code&gt;. It also describes how servers should respond with appropriate status codes (e.g., 401 Unauthorized) when authentication fails.&lt;/p&gt;
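&lt;p&gt;In other words, a client builds the header by Base64-encoding &lt;code&gt;username:password&lt;/code&gt; and prefixing it with the &lt;code&gt;Basic&lt;/code&gt; scheme. Here is a minimal Node.js sketch of both directions; the helper names are my own, chosen for illustration:&lt;/p&gt;

```typescript
// Illustrative helpers (names are mine, not from any library).
// Per RFC 7617, the credentials are "username:password", Base64-encoded,
// sent with the "Basic" scheme in the Authorization header.
function buildBasicAuthHeader(username: string, password: string): string {
  const encoded = Buffer.from(`${username}:${password}`, 'utf-8').toString('base64');
  return `Basic ${encoded}`;
}

function decodeBasicAuthHeader(header: string): { username: string; password: string } {
  const encoded = header.split(' ')[1];
  const decoded = Buffer.from(encoded, 'base64').toString('utf-8');
  // Split on the FIRST colon only: usernames must not contain ":", passwords may.
  const sep = decoded.indexOf(':');
  return { username: decoded.slice(0, sep), password: decoded.slice(sep + 1) };
}

const header = buildBasicAuthHeader('testuser', 'testpassword');
console.log(decodeBasicAuthHeader(header)); // round-trips to the original pair
```

&lt;p&gt;Note that this is encoding, not encryption, which is why Basic Authentication must always ride over HTTPS.&lt;/p&gt;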
&lt;h1&gt;Step 1: Setting Up the Node.js and TypeScript Environment&lt;/h1&gt;
&lt;p&gt;Please refer to the steps explained in our previous blog post &lt;a href=&quot;/posts/password-authentication-in-node-js-a-step-by-step-guide&quot;&gt;Password Authentication In Node.Js: A Step-By-Step Guide&lt;/a&gt; at &lt;a href=&quot;/posts/password-authentication-in-node-js-a-step-by-step-guide#b71a&quot;&gt;Step 1: Setting Up the Node.js and TypeScript Environment&lt;/a&gt;.&lt;/p&gt;
&lt;h1&gt;Step 2: Creating the Server&lt;/h1&gt;
&lt;h2&gt;usersData.ts&lt;/h2&gt;
&lt;p&gt;In this file, we define a simulated user database. It starts empty; users registered at runtime are stored here with their bcrypt-hashed passwords. Each user has a &lt;code&gt;username&lt;/code&gt; and a &lt;code&gt;password&lt;/code&gt; field.&lt;/p&gt;
&lt;p&gt;This file acts as our database for the sake of this example.&lt;/p&gt;
&lt;p&gt;The usage of bcrypt also has been explained in &lt;a href=&quot;/posts/password-authentication-in-node-js-a-step-by-step-guide&quot;&gt;Password Authentication In Node.Js: A Step-By-Step Guide&lt;/a&gt; already.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;interface User {
    username: string;
    password: string;
}
  
const users: User[] = [];
  
export default users;
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;&lt;code&gt;basicAuthMiddleware.ts&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;This file contains the basic authentication middleware. The middleware is responsible for authenticating users based on the credentials provided in the &lt;code&gt;Authorization&lt;/code&gt; header. It uses bcrypt to compare the provided password with the hashed password stored in the &lt;code&gt;usersData.ts&lt;/code&gt; file.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import { Request, Response, NextFunction } from &apos;express&apos;;
import { Buffer } from &apos;buffer&apos;;
import bcrypt from &apos;bcryptjs&apos;;

interface User {
    username: string;
    password: string;
}

const basicAuthMiddleware = (users: User[]) =&amp;gt; async (req: Request, res: Response, next: NextFunction) =&amp;gt; {
    try {
        const authHeader = req.headers.authorization;
        if (!authHeader) {
            // If no authorization header is provided, send a 401 response with the WWW-Authenticate header
            // so that browser will pop up username/password dialog
            res.setHeader(&apos;WWW-Authenticate&apos;, &apos;Basic&apos;);
            return res.status(401).json({ error: &apos;Authorization header missing&apos; });
        }

        const credentials = Buffer.from(authHeader.split(&apos; &apos;)[1], &apos;base64&apos;).toString(&apos;utf-8&apos;);
        const [username, password] = credentials.split(&apos;:&apos;);

        const user = users.find((user) =&amp;gt; user.username === username);
        if (!user) {
            return res.status(401).json({ error: &apos;Invalid username&apos; });
        }

        const isPasswordValid = await bcrypt.compare(password, user.password);
        if (!isPasswordValid) {
            return res.status(401).json({ error: &apos;Invalid password&apos; });
        }

        next();
    } catch (error) {
        res.status(500).json({ error: &apos;Internal server error&apos; });
    }
};

export default basicAuthMiddleware;
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;In the interest of security, a production-ready authentication system should not provide explicit feedback on whether the username or password is invalid. However, the code examples provided in this article aim to illustrate the principles of Basic Authentication based on RFC 7617 and are intended for educational purposes. They demonstrate the basic mechanics of authentication but may not fully address all security concerns.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;&lt;code&gt;app.ts&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;app.ts&lt;/code&gt; file sets up the Express server, handles user registration, and protects a route using the basic authentication middleware.&lt;/p&gt;
&lt;p&gt;By implementing authentication in middleware, when the middleware detects invalid credentials, it directly sends the appropriate error response, and the route handler will not be executed.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;app.ts&lt;/code&gt; imports the users&apos; data from &lt;code&gt;usersData.ts&lt;/code&gt; and creates the middleware by passing the users&apos; data as an argument to &lt;code&gt;basicAuthMiddleware&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;/register&lt;/code&gt; endpoint takes &lt;code&gt;{&quot;username&quot;: &quot;your_name&quot;, &quot;password&quot;: &quot;your_password&quot;}&lt;/code&gt; as input.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;/protected&lt;/code&gt; endpoint is used to verify account credentials.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import express from &apos;express&apos;;
import bodyParser from &apos;body-parser&apos;;
import bcrypt from &apos;bcryptjs&apos;;
import basicAuthMiddleware from &apos;./basicAuthMiddleware&apos;; // Import the basicAuthMiddleware
import users from &apos;./usersData&apos;; // Import the users data

const app = express();
const PORT = 3001;

app.use(bodyParser.json());

// User registration route
app.post(&apos;/register&apos;, async (req, res) =&amp;gt; {
    try {
        const { username, password } = req.body;

        // Check if the user already exists
        if (users.some((user) =&amp;gt; user.username === username)) {
            return res.status(400).json({ error: &apos;Username already exists&apos; });
        }

        // Hash the password using bcrypt
        const salt = await bcrypt.genSalt(10);
        const hashedPassword = await bcrypt.hash(password, salt);

        // Save the user in the database (in this example, we&apos;re using an in-memory array)
        const newUser = { username, password: hashedPassword };
        users.push(newUser);

        res.status(201).json({ message: &apos;User registered successfully!&apos; });
    } catch (error) {
        res.status(500).json({ error: &apos;Internal server error&apos; });
    }
});

// Create the basicAuthMiddleware with the users array as an argument
const authMiddleware = basicAuthMiddleware(users);

// Use the authMiddleware to protect a route
app.get(&apos;/protected&apos;, authMiddleware, (req, res) =&amp;gt; {
    res.json({ message: &apos;You have successfully accessed the protected route!&apos; });
});

app.listen(PORT, () =&amp;gt; {
    console.log(`Server is running on http://localhost:${PORT}`);
});
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Step 3: Testing the Server&lt;/h1&gt;
&lt;p&gt;Launch the server:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;npx ts-node ./app.ts
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open another terminal and run the command below:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;echo &apos;{&quot;username&quot;: &quot;testuser01&quot;, &quot;password&quot;: &quot;testpassword01&quot;}&apos; | http POST http://localhost:3001/register
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This creates a user &lt;code&gt;testuser01&lt;/code&gt; on the server with password &lt;code&gt;testpassword01&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Let’s try to access the protected URI &lt;code&gt;/protected&lt;/code&gt; :&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./http-auth-01.webp&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Using an online Base64 decoder (&lt;a href=&quot;https://emn178.github.io/online-tools/base64_decode.html&quot;&gt;link here&lt;/a&gt;), we can see the decoded credentials:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./http-auth-02.webp&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;If you try to access it from a browser, the browser will automatically pop up the username/password dialog as shown below:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./http-auth-03.webp&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;h1&gt;Pros and Cons&lt;/h1&gt;
&lt;h2&gt;Pros:&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;Simplicity: HTTP Basic Authentication is easy to implement and understand. It requires minimal additional overhead for client and server implementations.&lt;/li&gt;
&lt;li&gt;Standardization: It is a standardized authentication method supported by most web browsers and server frameworks.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Cons:&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Security:&lt;/strong&gt; The credentials are Base64-encoded but not encrypted. This means they can be intercepted if transmitted over an insecure network. It’s crucial to use HTTPS to mitigate this issue.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No Built-in Password Hashing:&lt;/strong&gt; Basic Authentication does not provide built-in mechanisms for securely storing or hashing passwords. Implementing password hashing and salting is the responsibility of the application developer. Like in our article, we have to implement password hashing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Limited Features:&lt;/strong&gt; It lacks advanced features like multi-factor authentication (MFA) or token-based authentication, which are often needed for more robust security.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No Session Management:&lt;/strong&gt; Basic Authentication does not manage user sessions. If session management is required, it needs to be implemented separately.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;User Experience:&lt;/strong&gt; While browsers handle the credential prompt, the user experience can be intrusive, especially for web applications.&lt;/li&gt;
&lt;/ol&gt;
&lt;h1&gt;Summary&lt;/h1&gt;
&lt;p&gt;HTTP Basic Authentication is a straightforward method for securing web resources. It serves well for simple use cases but may not be suitable for applications requiring more advanced security measures.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The source code of this tutorial has been uploaded to &lt;a href=&quot;https://github.com/geekcoding101/Authentication101&quot;&gt;GeekCoding101 github repo&lt;/a&gt; as well, feel free to take a look.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In the next blog, we will explore more advanced authentication methods, including token-based authentication using JSON Web Tokens (JWT).&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>OAuth 2.0 Grant Types</title><link>https://geekcoding101.com/posts/oauth-grant-types</link><guid isPermaLink="true">https://geekcoding101.com/posts/oauth-grant-types</guid><pubDate>Thu, 30 Nov 2023 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;List of Grant Types&lt;/h1&gt;
&lt;p&gt;Below is a table summarizing the different grant types in OAuth 2.0 along with brief descriptions and recommendations regarding their use:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Grant Type&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Authorization Code&lt;/td&gt;
&lt;td&gt;The most commonly used flow in OAuth 2.0. It involves the exchange of an authorization code for an access token. Suitable for server-side web applications and confidential clients.&lt;/td&gt;
&lt;td&gt;Recommended for web applications and confidential clients.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Implicit&lt;/td&gt;
&lt;td&gt;Designed for user-agent-based clients (e.g., browser-based JavaScript applications). Access token is returned directly to the client without an authorization code exchange.&lt;/td&gt;
&lt;td&gt;Deprecated due to security concerns.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Resource Owner Password Credentials&lt;/td&gt;
&lt;td&gt;Allows the client to exchange the user&apos;s username and password for an access token directly. Generally discouraged due to security implications and lack of federation support.&lt;/td&gt;
&lt;td&gt;Not recommended except in unavoidable legacy scenarios.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Client Credentials&lt;/td&gt;
&lt;td&gt;Enables clients to directly exchange client credentials (client ID and client secret) for an access token. Typically used for machine-to-machine communication.&lt;/td&gt;
&lt;td&gt;Recommended for machine-to-machine communication.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Refresh Token&lt;/td&gt;
&lt;td&gt;Allows clients to request a new access token without requiring the user to re-authenticate. It&apos;s not a grant type but rather a mechanism for obtaining new access tokens.&lt;/td&gt;
&lt;td&gt;Recommended for long-lived sessions and offline access.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;It&apos;s important to note that while some grant types are deprecated or discouraged because of security concerns or limited use cases, applicability can still vary with specific requirements. In general, prefer the authorization code flow whenever possible for enhanced security and flexibility.&lt;/p&gt;
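&lt;p&gt;As a quick illustration of the Client Credentials grant, here is a minimal TypeScript sketch of the token request a machine-to-machine client would send. The endpoint URL, client ID, and secret below are placeholders, not a real provider&apos;s values.&lt;/p&gt;

```typescript
// Hypothetical sketch of a Client Credentials token request. The endpoint
// URL and credentials are placeholders, not a real provider's values.
const tokenEndpoint = "https://auth.example.com/oauth/token";

function buildClientCredentialsRequest(clientId: string, clientSecret: string) {
  // The client credentials grant is a single POST to the token endpoint
  // with a form-encoded body; no user interaction is involved.
  const body = new URLSearchParams({
    grant_type: "client_credentials",
    client_id: clientId,
    client_secret: clientSecret,
  });
  return {
    url: tokenEndpoint,
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: body.toString(),
  };
}

const request = buildClientCredentialsRequest("my-client", "my-secret");
console.log(request.body);
```

&lt;p&gt;The client would POST this body to the token endpoint and typically receive a JSON response containing the access token.&lt;/p&gt;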
&lt;h1&gt;Is PKCE A Grant Type?&lt;/h1&gt;
&lt;p&gt;No, PKCE (Proof Key for Code Exchange) is not a grant type in OAuth 2.0.&lt;/p&gt;
&lt;p&gt;Instead, PKCE is an extension to the OAuth 2.0 authorization code flow, designed to enhance security, particularly in scenarios where the client secret cannot be reliably stored, such as mobile or native applications.&lt;/p&gt;
&lt;p&gt;In the OAuth 2.0 authorization code flow with PKCE, the core grant type remains the &quot;Authorization Code&quot; grant type.&lt;/p&gt;
&lt;p&gt;PKCE introduces additional security measures during the authorization code exchange process by utilizing a dynamically generated secret (the code verifier) and a hash-based challenge (the code challenge). This mechanism helps mitigate certain security risks, such as authorization code interception attacks.&lt;/p&gt;
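&lt;p&gt;To make this concrete, below is a minimal sketch of the PKCE pieces described above: generating a random code verifier and deriving its S256 code challenge. It follows RFC 7636; the function names are my own.&lt;/p&gt;

```typescript
import * as crypto from "crypto";

// Sketch of the PKCE pieces per RFC 7636; function names are my own.
// base64url encoding without padding, as PKCE requires.
function base64Url(buf: Buffer): string {
  return buf.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=/g, "");
}

// The client generates a random code verifier before the redirect...
function createCodeVerifier(): string {
  return base64Url(crypto.randomBytes(32)); // 43-character URL-safe string
}

// ...and sends the SHA-256 hash of it (the S256 code challenge) to the
// authorization endpoint; the verifier itself goes only to the token endpoint.
function createCodeChallenge(verifier: string): string {
  return base64Url(crypto.createHash("sha256").update(verifier).digest());
}

const verifier = createCodeVerifier();
console.log(verifier, createCodeChallenge(verifier));
```

&lt;p&gt;Because only the hash travels through the browser in the first leg, an attacker who intercepts the authorization code still cannot redeem it without the original verifier.&lt;/p&gt;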
</content:encoded><author>GeekCoding101</author></item><item><title>OAuth 2.0 Authorization Code Flow</title><link>https://geekcoding101.com/posts/oauth-2-0-authorization-code-flow</link><guid isPermaLink="true">https://geekcoding101.com/posts/oauth-2-0-authorization-code-flow</guid><pubDate>Sun, 03 Dec 2023 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Brief Description&lt;/h1&gt;
&lt;p&gt;The OAuth 2.0 authorization code flow is a secure and widely adopted method for obtaining access tokens to access user resources on behalf of the user.&lt;/p&gt;
&lt;h1&gt;Steps&lt;/h1&gt;
&lt;p&gt;Here&apos;s a summary of the steps in the authorization code flow:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Client Initiation:&lt;/strong&gt; The client application initiates the authorization process by redirecting the user to the authorization server&apos;s authorization endpoint.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;User Authentication and Consent:&lt;/strong&gt; The user is prompted to authenticate with the authorization server and grant permission to the client application to access their resources.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Authorization Code Generation:&lt;/strong&gt; Upon successful authentication and consent, the authorization server generates an authorization code and redirects the user back to the client application along with the authorization code.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Access Token Exchange:&lt;/strong&gt; The client application exchanges the authorization code for an access token by making a request to the authorization server&apos;s token endpoint.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Access Token Usage:&lt;/strong&gt; The client application uses the access token to access the user&apos;s protected resources, such as APIs or data endpoints.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
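&lt;p&gt;Steps 1 and 4 above can be sketched as follows. This is a hypothetical illustration: the authorization server URLs, client credentials, and redirect URI are all placeholder values.&lt;/p&gt;

```typescript
// Hypothetical sketch of steps 1 and 4. All URLs, client credentials, and
// the redirect URI are placeholder values.

// Step 1: the URL the client redirects the user to.
function buildAuthorizeUrl(clientId: string, redirectUri: string, state: string): string {
  const params = new URLSearchParams({
    response_type: "code",
    client_id: clientId,
    redirect_uri: redirectUri,
    scope: "profile",
    state, // random value the client re-checks on the redirect back
  });
  return "https://auth.example.com/authorize?" + params.toString();
}

// Step 4: the form body the client POSTs to the token endpoint,
// exchanging the authorization code for an access token.
function buildTokenExchangeBody(code: string, clientId: string, clientSecret: string, redirectUri: string): string {
  return new URLSearchParams({
    grant_type: "authorization_code",
    code,
    client_id: clientId,
    client_secret: clientSecret,
    redirect_uri: redirectUri,
  }).toString();
}

console.log(buildAuthorizeUrl("demo-client", "https://app.example.com/cb", "xyz123"));
```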
&lt;p&gt;To clarify, in the authorization code flow, the authorization endpoint issues an authorization code to the client application upon user consent, not an access token directly.&lt;/p&gt;
&lt;h1&gt;Why Doesn&apos;t the Authorization Code Flow Issue an Access Token Directly?&lt;/h1&gt;
&lt;p&gt;The OAuth 2.0 authorization code flow is designed to enhance security and minimize certain risks associated with transmitting sensitive information, such as access tokens, through the user&apos;s browser or mobile device.&lt;/p&gt;
&lt;p&gt;Here are some reasons why the authorization endpoint issues an authorization code instead of an access token directly:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reduced Exposure of Access Tokens&lt;/strong&gt;: Access tokens are sensitive pieces of information that grant access to the user&apos;s protected resources. By issuing an authorization code instead of an access token directly, the authorization server reduces the exposure of access tokens to potentially compromised user agents (such as web browsers or mobile apps). Since the authorization code is short-lived and can only be exchanged for tokens by the client application with its credentials, the risk associated with the interception of the authorization code is lower than if an access token were transmitted directly.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Separation of Concerns&lt;/strong&gt;: Separating the authorization process into two steps—obtaining the authorization code and exchanging it for an access token—helps clarify the roles and responsibilities of different components in the OAuth 2.0 flow. The authorization endpoint is responsible for handling user consent and authentication, while the token endpoint is responsible for issuing access tokens based on valid authorization codes and client credentials. This separation enhances the security and maintainability of the OAuth 2.0 protocol.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Support for Additional Security Measures&lt;/strong&gt;: The authorization code flow allows for the implementation of additional security measures, such as client authentication at the token endpoint using client credentials (client ID and client secret), which helps verify the identity of the client application before issuing access tokens. This adds an extra layer of security to the token issuance process and helps prevent unauthorized access to user resources.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Overall, by issuing an authorization code instead of an access token directly, the OAuth 2.0 authorization code flow aims to improve security, reduce exposure to sensitive information, and provide a clear separation of concerns in the authentication and authorization process.&lt;/p&gt;
&lt;h1&gt;Benefits of Authorization Code Flow&lt;/h1&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Enhanced Security: By separating the authorization and token exchange steps, it reduces the risk of exposing sensitive information, such as access tokens, during the authorization process.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;User Consent: Users have control over which resources the client application can access, ensuring privacy and security.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Scalability: The authorization code flow is well-suited for a wide range of client types, including web applications, mobile apps, and desktop applications.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Refresh Token Support: It supports the use of refresh tokens, allowing clients to obtain new access tokens without requiring user interaction.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Unlocking Web Security: Master JWT Authentication</title><link>https://geekcoding101.com/posts/unlocking-web-security-master-jwt-authentication</link><guid isPermaLink="true">https://geekcoding101.com/posts/unlocking-web-security-master-jwt-authentication</guid><pubDate>Mon, 15 Jan 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Introduction&lt;/h1&gt;
&lt;p&gt;JSON Web Tokens (JWTs) play a crucial role in web application security. In this blog, we walk through the concept of JWT, focusing on the different types of claims, the structure of a JWT, and the algorithms used in signatures; finally, I will implement JWT authentication from scratch in Node.js and Express.js.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This is my 4th article in Auth101! It’s 2024 now! Looking forward to a wonderful year filled with cool tech updates, new tricks in cyber security, and a bunch of fun coding adventures. I can’t wait to dive into more authentication topics with you all 😃&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h1&gt;Understanding JWT&lt;/h1&gt;
&lt;p&gt;JSON Web Tokens (JWTs) originated as a compact and self-contained way for securely transmitting information between parties as a JSON object. Defined in RFC 7519, JWTs have become a widely adopted standard in the field of web security for their simplicity and versatility.&lt;/p&gt;
&lt;p&gt;A JWT is a string comprising three parts separated by dots (&lt;code&gt;.&lt;/code&gt;): Base64Url encoded header, Base64Url encoded payload, and signature.&lt;/p&gt;
&lt;p&gt;It typically looks like &lt;code&gt;xxxxx.yyyyy.zzzzz&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Let’s deep dive into the three parts: Header, Payload, and Signature.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Header&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The header typically consists of the token type and the signing algorithm, such as HMAC SHA256 or RSA.&lt;/p&gt;
&lt;p&gt;For example:&lt;code&gt;{ &quot;alg&quot;: &quot;HS256&quot;, &quot;typ&quot;: &quot;JWT&quot; }&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Payload&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The payload contains claims, which are statements about an entity and additional metadata. Claims are categorized into &lt;strong&gt;registered&lt;/strong&gt;, &lt;strong&gt;public&lt;/strong&gt;, and &lt;strong&gt;private&lt;/strong&gt; claims. The latter two are for custom claims. Public claims are collision-resistant, while private claims are subject to possible collisions. In a JWT, a claim appears as a name/value pair where the name is always a string and the value can be any JSON value. For example, the following JSON object contains three claims (&lt;code&gt;sub&lt;/code&gt;, &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;admin&lt;/code&gt;):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  &quot;sub&quot;: &quot;1234567890&quot;,
  &quot;name&quot;: &quot;Tom Green&quot;,
  &quot;admin&quot;: false
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;1. Registered Claims&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;These are predefined claim names with specific meanings recommended for interoperability. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;iss&lt;/code&gt; (Issuer): Identifies the principal that issued the JWT.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;sub&lt;/code&gt; (Subject): Identifies the principal that is the subject of the JWT.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;aud&lt;/code&gt; (Audience): Identifies the recipients that the JWT is intended for.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;exp&lt;/code&gt; (Expiration Time): Identifies the expiration time on or after which the JWT must not be accepted for processing.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;nbf&lt;/code&gt; (Not Before): Identifies the time before which the JWT must not be accepted for processing.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;iat&lt;/code&gt; (Issued At): Identifies the time at which the JWT was issued.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;jti&lt;/code&gt; (JWT ID): Unique identifier; can be used to prevent the JWT from being replayed (allows a token to be used only once).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can see a full list of registered claims at the &lt;a href=&quot;https://www.iana.org/assignments/jwt/jwt.xhtml#claims&quot;&gt;IANA JSON Web Token Claims Registry&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2. Public Claims&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;These can be defined at will, but to avoid collisions they should be registered in the IANA JSON Web Token Claims Registry or use a collision-resistant name such as a URI.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3. Private Claims&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;These are custom claims created to share information between parties that agree on using them.&lt;/p&gt;
&lt;p&gt;When creating custom claims for JWTs that are specific to your application, it’s often beneficial to use namespacing. This ensures that your claims are unique and do not conflict with other standard or custom claims. Here’s an example of how to implement namespaced custom claims:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  &quot;https://yourdomain.com/claims/user_type&quot;: &quot;admin&quot;,
  &quot;https://yourdomain.com/claims/access_level&quot;: &quot;5&quot;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this example, custom claims are prefixed with a URL (&lt;code&gt;https://yourdomain.com/claims/&lt;/code&gt;) that is under your control. This URL acts as a namespace, reducing the likelihood of your claims conflicting with others. The claims &lt;code&gt;user_type&lt;/code&gt; and &lt;code&gt;access_level&lt;/code&gt; are specific to the application and are namespaced to ensure uniqueness.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Signature&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The signature is created by taking the encoded header, payload, and a secret, then signing it with the algorithm specified in the header. The signature verifies that the sender of the JWT is who it says it is and ensures that the message wasn’t changed along the way.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Example&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;An example JWT for a user &lt;code&gt;john.doe&lt;/code&gt; using HMAC SHA256 might look like this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Header: &lt;code&gt;{ &quot;alg&quot;: &quot;HS256&quot;, &quot;typ&quot;: &quot;JWT&quot; }&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Payload: &lt;code&gt;{ &quot;sub&quot;: &quot;john.doe&quot;, &quot;name&quot;: &quot;John Doe&quot;, &quot;admin&quot;: false, &quot;iat&quot;: 1615070800 }&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Signature: Cryptographic signature generated from the header, payload, and secret key. We will see the implementation later.&lt;/li&gt;
&lt;/ul&gt;
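&lt;p&gt;Before moving on to the implementation, here is a small sketch of how a verifier might apply the time-based registered claims (&lt;code&gt;exp&lt;/code&gt; and &lt;code&gt;nbf&lt;/code&gt;) described above. The function name and the inclusive expiry cutoff are my own choices.&lt;/p&gt;

```typescript
// Sketch of applying the time-based registered claims; the function name
// and the inclusive expiry cutoff are my own choices.
// `payload` is assumed to be an already-decoded JWT payload object.
function checkTimeClaims(payload: { exp?: number; nbf?: number }, now: number = Math.floor(Date.now() / 1000)): boolean {
  if (payload.exp !== undefined) {
    if (now >= payload.exp) return false; // exp: must not accept on or after this time
  }
  if (payload.nbf !== undefined) {
    if (payload.nbf > now) return false; // nbf: must not accept before this time
  }
  return true;
}

console.log(checkTimeClaims({ exp: 1700000000 }, 1600000000)); // true at that earlier time
```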
&lt;h1&gt;Implementing JWT Authentication&lt;/h1&gt;
&lt;h1&gt;Step 1: Setting Up the Node.js and TypeScript Environment&lt;/h1&gt;
&lt;p&gt;Please refer to the steps explained in our previous blog post &lt;a href=&quot;/posts/password-authentication-in-node-js-a-step-by-step-guide&quot;&gt;Password Authentication In Node.Js: A Step-By-Step Guide&lt;/a&gt; at &lt;a href=&quot;/posts/password-authentication-in-node-js-a-step-by-step-guide#b71a&quot;&gt;Step 1: Setting Up the Node.js and TypeScript Environment&lt;/a&gt;.&lt;/p&gt;
&lt;h1&gt;Step 2: Creating the Server&lt;/h1&gt;
&lt;h2&gt;usersData.ts&lt;/h2&gt;
&lt;p&gt;This is the same as the file &lt;a href=&quot;/technologies/security/03-http-basic-authentication/#23f9&quot;&gt;usersData.ts&lt;/a&gt; we discussed in &lt;a href=&quot;/technologies/security/03-http-basic-authentication/&quot;&gt;A Deep Dive Into HTTP Basic Authentication&lt;/a&gt;, except that we added a new field called &lt;code&gt;refreshToken&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;interface User {
  username: string;
  password: string;
  refreshToken?: string;
}

const users: User[] = [];

export default users;
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;jwt.ts&lt;/h2&gt;
&lt;p&gt;I decided to do a custom implementation for generating and verifying JWTs (JSON Web Tokens) without using external libraries like &lt;code&gt;jsonwebtoken&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;It provides the following functionality:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Base64 URL Encoding Function (&lt;code&gt;base64UrlEncode&lt;/code&gt;):&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Converts a &lt;code&gt;Buffer&lt;/code&gt; object to a Base64 URL-encoded string. This is necessary because standard Base64 encoding includes characters (&lt;code&gt;+&lt;/code&gt;, &lt;code&gt;/&lt;/code&gt;, and &lt;code&gt;=&lt;/code&gt;) that are not URL-safe. The function replaces these characters to make the string URL-safe.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Signature Function (&lt;code&gt;sign&lt;/code&gt;):&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Takes the encoded header, payload, and secret key, then generates a signature using HMAC SHA256.&lt;/li&gt;
&lt;li&gt;The resulting signature is then Base64 URL-encoded.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Generate Access Token Function (&lt;code&gt;generateAccessToken&lt;/code&gt;):&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Creates a JWT with a header specifying the algorithm (&lt;code&gt;HS256&lt;/code&gt;) and token type (&lt;code&gt;JWT&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;The payload includes the &lt;code&gt;username&lt;/code&gt; and an &lt;code&gt;exp&lt;/code&gt; (expiration time), set to 15 minutes from the current time.&lt;/li&gt;
&lt;li&gt;The header and payload are Base64 URL-encoded and concatenated with a period, and then signed to generate the JWT.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Generate Refresh Token Function (&lt;code&gt;generateRefreshToken&lt;/code&gt;):&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Similar to the access token, but the payload includes a longer expiration time (7 days) and an additional &lt;code&gt;type&lt;/code&gt; field set to &lt;code&gt;&apos;refresh&apos;&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;This token is used to obtain new access tokens without requiring the user to log in again.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Verify Token Function (&lt;code&gt;verifyToken&lt;/code&gt;):&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Splits the JWT into its components (header, payload, signature).&lt;/li&gt;
&lt;li&gt;Regenerates the signature based on the header and payload from the token and compares it with the received signature.&lt;/li&gt;
&lt;li&gt;If the signatures match, the function returns the decoded payload; otherwise, it throws an error indicating an invalid token.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;secretKey:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;secretKey&lt;/code&gt; must be kept confidential and secure because it is essentially the &quot;key&quot; that locks and unlocks the JWTs. If an unauthorized party gains access to the &lt;code&gt;secretKey&lt;/code&gt;, they could potentially generate their own valid tokens or tamper with existing tokens, leading to security breaches.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import * as crypto from &apos;crypto&apos;;

const secretKey = &apos;your_secret_key&apos;; // Use a strong secret key, ideally loaded from an environment variable

const base64UrlEncode = (str: Buffer): string =&amp;gt; {
  return str.toString(&apos;base64&apos;)
    .replace(/\+/g, &apos;-&apos;)
    .replace(/\//g, &apos;_&apos;)
    .replace(/=/g, &apos;&apos;);
};

const sign = (header: string, payload: string, secret: string): string =&amp;gt; {
  // Note: digesting to a base64 string and then base64url-encoding that
  // string again is non-standard (a typical JWT library base64url-encodes
  // the raw digest bytes), but it is internally consistent in this tutorial.
  const signature = crypto.createHmac(&apos;SHA256&apos;, secret)
    .update(`${header}.${payload}`)
    .digest(&apos;base64&apos;);
  return base64UrlEncode(Buffer.from(signature));
};

export const generateAccessToken = (username: string): string =&amp;gt; {
  const header = { alg: &apos;HS256&apos;, typ: &apos;JWT&apos; };
  const payload = { username, exp: Math.floor(Date.now() / 1000) + (15 * 60) }; // 15 minutes expiry
  const encodedHeader = base64UrlEncode(Buffer.from(JSON.stringify(header)));
  const encodedPayload = base64UrlEncode(Buffer.from(JSON.stringify(payload)));
  const signature = sign(encodedHeader, encodedPayload, secretKey);
  return `${encodedHeader}.${encodedPayload}.${signature}`;
};

export const generateRefreshToken = (username: string): string =&amp;gt; {
  const header = { alg: &apos;HS256&apos;, typ: &apos;JWT&apos; };
  const payload = { username, type: &apos;refresh&apos;, exp: Math.floor(Date.now() / 1000) + (7 * 24 * 60 * 60) }; // 7 days expiry
  const encodedHeader = base64UrlEncode(Buffer.from(JSON.stringify(header)));
  const encodedPayload = base64UrlEncode(Buffer.from(JSON.stringify(payload)));
  const signature = sign(encodedHeader, encodedPayload, secretKey);
  return `${encodedHeader}.${encodedPayload}.${signature}`;
};

export const verifyToken = (token: string): any =&amp;gt; {
  const [encodedHeader, encodedPayload, signature] = token.split(&apos;.&apos;);
  const verifiedSignature = sign(encodedHeader, encodedPayload, secretKey);
  if (verifiedSignature !== signature) {
    throw new Error(&apos;Invalid token&apos;);
  }
  // &apos;base64url&apos; (Node 15.7+) decodes the URL-safe alphabet correctly; a
  // production implementation should also reject tokens whose exp has passed.
  return JSON.parse(Buffer.from(encodedPayload, &apos;base64url&apos;).toString());
};
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;jwtAuthMiddleware.ts&lt;/h2&gt;
&lt;p&gt;This middleware is designed to handle JWT authentication for incoming HTTP requests.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import { Request, Response, NextFunction } from &apos;express&apos;;
import { verifyToken } from &apos;./jwt&apos;;

export const jwtAuthMiddleware = (req: Request, res: Response, next: NextFunction) =&amp;gt; {
    try {
        const authHeader = req.headers.authorization;
        if (!authHeader) {
            return res.status(401).json({ error: &apos;Authorization header missing&apos; });
        }

        const token = authHeader.split(&apos; &apos;)[1];
        const decodedUser = verifyToken(token);

        // Create a closure to pass the decoded user
        (req as any).getUser = () =&amp;gt; decodedUser;

        next();
    } catch (error) {
        res.status(401).json({ error: &apos;Invalid token&apos; });
    }
};
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;app.ts&lt;/h2&gt;
&lt;p&gt;Now let’s assemble all in app.ts.&lt;/p&gt;
&lt;p&gt;It includes routes for user registration, deletion, listing, login, token refresh, and accessing a protected route.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import express, { Request, Response } from &apos;express&apos;;
import bodyParser from &apos;body-parser&apos;;
import bcrypt from &apos;bcrypt&apos;;
import users from &apos;./usersData&apos;;
import { generateAccessToken, generateRefreshToken, verifyToken } from &apos;./jwt&apos;;
import { jwtAuthMiddleware } from &apos;./jwtAuthMiddleware&apos;;

const app = express();
const PORT = 3001;

app.use(bodyParser.json());

// User registration route
app.post(&apos;/register&apos;, async (req: Request, res: Response) =&amp;gt; {
    try {
        const { username, password } = req.body;

        // Check if the user already exists
        if (users.some((user) =&amp;gt; user.username === username)) {
            return res.status(400).json({ error: &apos;Username already exists&apos; });
        }

        // Hash the password using bcrypt
        const salt = await bcrypt.genSalt(10);
        const hashedPassword = await bcrypt.hash(password, salt);

        // Save the user
        const newUser = { username, password: hashedPassword };
        users.push(newUser);

        res.status(201).json({ message: &apos;User registered successfully!&apos; });
    } catch (error) {
        res.status(500).json({ error: &apos;Internal server error&apos; });
    }
});

app.delete(&apos;/user/:username&apos;, (req: Request, res: Response) =&amp;gt; {
  const { username } = req.params;
  const userIndex = users.findIndex(user =&amp;gt; user.username === username);

  if (userIndex === -1) {
      return res.status(404).json({ error: &apos;User not found&apos; });
  }

  users.splice(userIndex, 1);
  res.status(200).json({ message: `User ${username} deleted successfully` });
});

app.get(&apos;/users&apos;, (req: Request, res: Response) =&amp;gt; {
  const usersWithoutPasswords = users.map(({ password, ...userWithoutPassword }) =&amp;gt; userWithoutPassword);
  res.json(usersWithoutPasswords);
});

// User login route
app.post(&apos;/login&apos;, async (req: Request, res: Response) =&amp;gt; {
    try {
        const { username, password } = req.body;
        const user = users.find((user) =&amp;gt; user.username === username);

        if (!user) {
            return res.status(401).json({ error: &apos;Invalid username&apos; });
        }

        const isPasswordValid = await bcrypt.compare(password, user.password);
        if (!isPasswordValid) {
            return res.status(401).json({ error: &apos;Invalid password&apos; });
        }

        const accessToken = generateAccessToken(username);
        const refreshToken = generateRefreshToken(username);
        res.json({ accessToken, refreshToken });
    } catch (error) {
        res.status(500).json({ error: &apos;Internal server error&apos; });
    }
});

// Refresh token route
app.post(&apos;/refresh&apos;, (req: Request, res: Response) =&amp;gt; {
    const { refreshToken } = req.body;
    try {
        const decoded = verifyToken(refreshToken);
        if (decoded.type !== &apos;refresh&apos;) {
            return res.status(401).json({ error: &apos;Invalid refresh token&apos; });
        }
        const newAccessToken = generateAccessToken(decoded.username);
        res.json({ accessToken: newAccessToken });
    } catch (error) {
        res.status(401).json({ error: &apos;Invalid refresh token&apos; });
    }
});

// Protected route
app.get(&apos;/protected&apos;, jwtAuthMiddleware, (req: Request, res: Response) =&amp;gt; {
  const user = (req as any).getUser();
  res.json({ message: &apos;Protected route accessed&apos;, user });
});

app.listen(PORT, () =&amp;gt; {
    console.log(`Server is running on http://localhost:${PORT}`);
});
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Step 3: Testing the Server&lt;/h1&gt;
&lt;p&gt;Launch the server:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;npx ts-node ./app.ts
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Open another terminal and run the command below:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;echo &apos;{&quot;username&quot;: &quot;testuser01&quot;, &quot;password&quot;: &quot;testpassword01&quot;}&apos; | http POST http://localhost:3001/register
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This creates a user &lt;code&gt;testuser01&lt;/code&gt; on the server with password &lt;code&gt;testpassword01&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Now log in with this user to get an &lt;code&gt;accessToken&lt;/code&gt; and &lt;code&gt;refreshToken&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# echo &apos;{&quot;username&quot;: &quot;testuser01&quot;, &quot;password&quot;: &quot;testpassword01&quot;}&apos; | http POST http://localhost:3001/login

HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 365
Content-Type: application/json; charset=utf-8
Date: Mon, 15 Jan 2024 00:53:22 GMT
ETag: W/&quot;16d-IsPcbeAuThyiqhEWd7jZTpqMHlQ&quot;
Keep-Alive: timeout=5
X-Powered-By: Express

{
  &quot;accessToken&quot;: &quot;eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VybmFtZSI6InRlc3R1c2VyMDEiLCJleHAiOjE3MDUyODA5MDJ9.Q0MrUXA4RGI1Smc4SUFmYjV6UFlFRmhWL2NsV20rTHppSlpHemZjSWdsZz0&quot;,
  &quot;refreshToken&quot;: &quot;eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VybmFtZSI6InRlc3R1c2VyMDEiLCJ0eXBlIjoicmVmcmVzaCIsImV4cCI6MTcwNTg4NDgwMn0.TWozZlNvVnhBODJuUjFLc2JVcDRZT2hxZmFSNU9nR01MK3gvNTRnSlNWRT0&quot;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s try to access the protected URI &lt;code&gt;/protected&lt;/code&gt; with the &lt;code&gt;accessToken&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ http GET http://localhost:3001/protected &quot;Authorization:Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VybmFtZSI6InRlc3R1c2VyMDEiLCJleHAiOjE3MDUyODA1NjN9.TXlFR0NUMFZKOXJRVTgvYzFaaGZ5R0JMSTAwdVF3YkNRN1dUa1FQbG9NVT0&quot;
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 88
Content-Type: application/json; charset=utf-8
Date: Mon, 15 Jan 2024 00:50:57 GMT
ETag: W/&quot;58-CkXXzga6an0r8ICmEq1Q9VAps9I&quot;
Keep-Alive: timeout=5
X-Powered-By: Express

{
  &quot;message&quot;: &quot;Protected route accessed&quot;,
  &quot;user&quot;: {
    &quot;exp&quot;: 1705280563,
    &quot;username&quot;: &quot;testuser01&quot;
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So far so good!&lt;/p&gt;
&lt;p&gt;Now let’s use the &lt;code&gt;refreshToken&lt;/code&gt; to request a new &lt;code&gt;accessToken&lt;/code&gt; :&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ http POST http://localhost:3001/refresh &apos;Content-Type:application/json&apos; &amp;lt;&amp;lt;&amp;lt; &apos;{&quot;refreshToken&quot;:&quot;eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VybmFtZSI6InRlc3R1c2VyMDEiLCJ0eXBlIjoicmVmcmVzaCIsImV4cCI6MTcwNTg4NDQ2M30.MDNHMzI0MEd5SXJCQXRZVCtxVEdCWVVOeDd5Z2F4cXlyaU9xYzB0dTFBWT0&quot;}&apos;

HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 171
Content-Type: application/json; charset=utf-8
Date: Mon, 15 Jan 2024 00:53:09 GMT
ETag: W/&quot;ab-TBXJ1UxTtbjzvvrwtTfhDXlSc1Q&quot;
Keep-Alive: timeout=5
X-Powered-By: Express

{
  &quot;accessToken&quot;: &quot;eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VybmFtZSI6InRlc3R1c2VyMDEiLCJleHAiOjE3MDUyODA4ODl9.WkQzeWJTMU80R0dycFhNc1ZLWTVTZjBjbGZwTEpwMi96RFI1Mnh6ZkNIWT0&quot;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;By the way, there are online JWT encoders/decoders you can use, for example &lt;a href=&quot;https://www.jstoolset.com/jwt&quot;&gt;https://www.jstoolset.com/jwt&lt;/a&gt;. Just paste the JWT string and it will decode the &lt;code&gt;header&lt;/code&gt; and &lt;code&gt;payload&lt;/code&gt; for you.&lt;/p&gt;
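&lt;p&gt;If you prefer not to paste tokens into an online tool (a real token is a credential, after all), a few lines of local code can do the same decoding. This is a sketch; note that decoding a JWT requires no secret, only verifying the signature does.&lt;/p&gt;

```typescript
// Sketch of decoding a JWT locally; no secret is needed to decode,
// only to verify the signature.
function decodeJwt(token: string): { header: any; payload: any } {
  const parts = token.split(".");
  // Convert base64url to standard base64, then parse the JSON inside.
  const fromB64Url = (s: string) =>
    JSON.parse(Buffer.from(s.replace(/-/g, "+").replace(/_/g, "/"), "base64").toString());
  return { header: fromB64Url(parts[0]), payload: fromB64Url(parts[1]) };
}

// The access token returned by the /login example above:
const token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VybmFtZSI6InRlc3R1c2VyMDEiLCJleHAiOjE3MDUyODA5MDJ9.Q0MrUXA4RGI1Smc4SUFmYjV6UFlFRmhWL2NsV20rTHppSlpHemZjSWdsZz0";
console.log(decodeJwt(token).payload.username); // testuser01
```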
&lt;h1&gt;Summary&lt;/h1&gt;
&lt;p&gt;And there we have it — our exploration of JWTs is at a pause. I hope this journey has shed some light on the inner workings of JWT authentication and its role in securing web applications. But as they say, every end is a new beginning.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The source code of this tutorial has been uploaded to &lt;a href=&quot;https://github.com/geekcoding101/Authentication101&quot;&gt;GeekCoding101 github repo&lt;/a&gt; as well, feel free to star my repo and explore.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Now, let’s ponder a common scenario: You log into a website and stay there, browsing around. Ever wondered how the server keeps recognizing you as you navigate from page to page? How does it ensure you still have access to all those protected areas without asking you to log in again and again? This isn’t just a matter of convenience; it’s a crucial aspect of user experience and security.&lt;/p&gt;
&lt;p&gt;Is it something to do with sessions or cookies, perhaps? Well, that’s precisely the topic we’ll delve into in our next blog. We’ll unravel the mysteries of session management, cookies, and how they work together to maintain your authenticated state in a web application. It’s an essential piece of the puzzle for understanding comprehensive web security.&lt;/p&gt;
&lt;p&gt;So stay tuned for our next discussion where we decode the secrets behind seamless and secure browsing experiences. Until then, happy coding, and keep those applications secure!&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Mastering Openssl Command and NSS Database Management</title><link>https://geekcoding101.com/posts/mastering-openssl-command-and-nss-database-management</link><guid isPermaLink="true">https://geekcoding101.com/posts/mastering-openssl-command-and-nss-database-management</guid><pubDate>Fri, 05 Apr 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Greetings to all you geeks out there!&lt;/p&gt;
&lt;p&gt;It&apos;s a pleasure to have you here at geekcoding101.com!&lt;/p&gt;
&lt;p&gt;With almost 20 years immersed in the vibrant world of Linux and the security domain, I&apos;ve encountered a myriad of tools and technologies that have shaped my journey. Today, I&apos;m excited to introduce you to OpenSSL and certutil, two indispensable utilities that play pivotal roles in managing digital certificates and encryption. Whether you&apos;re safeguarding your web servers or securing communications, understanding these tools is crucial. I&apos;ve distilled my insights and tips into this post, aiming to arm you with the knowledge to leverage these powerful utilities effectively.&lt;/p&gt;
&lt;p&gt;Enjoy!&lt;/p&gt;
&lt;h1&gt;Openssl&lt;/h1&gt;
&lt;p&gt;OpenSSL is an open-source software library that provides a robust, commercial-grade, and full-featured toolkit for SSL and TLS protocols, as well as a general-purpose cryptography library. It is widely used by internet servers, including the majority that implement secure web (HTTPS) connections, as well as in countless other security-sensitive applications. Here are some key aspects of OpenSSL:&lt;/p&gt;
&lt;h2&gt;Core Features&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Encryption&lt;/strong&gt;: Offers cryptographic algorithms for encrypting data, ensuring that information can be transmitted or stored securely. This includes algorithms like AES, DES, RC4, and more.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;SSL/TLS Protocols&lt;/strong&gt;: Facilitates secure communications over computer networks against eavesdropping, tampering, and message forgery. OpenSSL includes implementations of the SSL and TLS protocols to secure network communications.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Cryptographic Hash Functions&lt;/strong&gt;: Supports hash functions like SHA-1, SHA-256, and MD5, used for creating message digests that ensure the integrity of data.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Digital Certificates&lt;/strong&gt;: Manages X.509 certificates which are essential for establishing SSL/TLS connections. OpenSSL can generate certificate signing requests (CSRs), create certificates, and manage certificate chains.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Public Key Infrastructure (PKI)&lt;/strong&gt;: Supports PKI essentials for managing public and private keys, including generating key pairs, signing certificates, and more.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Query Information&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Query on Private Key&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;openssl rsa -in privatekey.pem -check&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Query All Information&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;openssl x509 -in certificate.pem -text -noout&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Query Subject&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;openssl x509 -in certificate.pem -subject -noout&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Query Validity&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;openssl x509 -in certificate.pem -dates -noout&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Query Purpose&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;openssl x509 -in certificate.pem -purpose -noout&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Certificate purposes:
SSL client : No
SSL client CA : Yes
SSL server : No
SSL server CA : Yes
Netscape SSL server : No
Netscape SSL server CA : Yes
S/MIME signing : No
S/MIME signing CA : Yes
S/MIME encryption : No
S/MIME encryption CA : Yes
CRL signing : No
CRL signing CA : Yes
Any Purpose : Yes
Any Purpose CA : Yes
OCSP helper : Yes
OCSP helper CA : Yes
Time Stamp signing : No
Time Stamp signing CA : Yes
&lt;/code&gt;&lt;/pre&gt;
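&lt;p&gt;One more query I reach for constantly: checking whether a certificate and a private key actually belong together. A cert matches a key exactly when their public keys are identical, so comparing digests of the two public keys settles it. A minimal sketch (the demo pair is generated on the spot; file names are placeholders):&lt;/p&gt;

```shell
# Generate a throwaway cert/key pair just for the demo.
openssl req -x509 -newkey rsa:2048 -nodes -keyout privatekey.pem \
  -out certificate.pem -days 1 -subj "/CN=match-demo"
# Extract the public key from each side and hash it.
cert_pub=$(openssl x509 -in certificate.pem -noout -pubkey | openssl sha256)
key_pub=$(openssl pkey -in privatekey.pem -pubout | openssl sha256)
# Identical digests mean the pair belongs together.
if [ "$cert_pub" = "$key_pub" ]; then echo "cert and key match"; fi
```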
&lt;p&gt;&lt;strong&gt;Download Cert from Remote Server&lt;/strong&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;openssl s_client -showcerts -debug -connect ldap.XXXX.com:636 &amp;lt; /dev/null &amp;gt; /tmp/ldap.out 2&amp;gt;&amp;amp;1
sed -n &apos;/BEGIN CERTIFICATE/,/END CERTIFICATE/p&apos; /tmp/ldap.out  &amp;gt; /tmp/ldap.pem
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;PKCS#12 (PFX) File Management&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Convert PFX to PEM&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;openssl pkcs12 -in filename.pfx -out certificate.pem -nodes&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Print Some Info About a PKCS#12 File&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;openssl pkcs12 -info -in filename.pfx&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Print Some Info About a PKCS#12 File in Legacy Mode&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;openssl pkcs12 -info -in filename.pfx -legacy&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Extract Only Client Certificates + Key&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;openssl pkcs12 -in filename.pfx -clcerts -out clientcert.pem&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Extract Only Client Cert&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;openssl pkcs12 -in filename.pfx -clcerts -nokeys -out clientcert.pem&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Extract Unencrypted Key File from PFX&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;openssl pkcs12 -in filename.pfx -nocerts -nodes -out privatekey.pem&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Extract CA Cert from PFX&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;openssl pkcs12 -in filename.pfx -cacerts -nokeys -out cacert.pem&lt;/code&gt;&lt;/p&gt;
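&lt;p&gt;For completeness, the reverse direction: bundling a PEM certificate and key back into a PFX with &lt;code&gt;-export&lt;/code&gt;. A sketch with a throwaway pair (file names and the password are placeholders):&lt;/p&gt;

```shell
# Throwaway cert/key pair for the demo.
openssl req -x509 -newkey rsa:2048 -nodes -keyout privatekey.pem \
  -out certificate.pem -days 1 -subj "/CN=pfx-demo"
# Bundle them into a PKCS#12 file protected by a password.
openssl pkcs12 -export -in certificate.pem -inkey privatekey.pem \
  -out filename.pfx -passout pass:changeit
# Sanity check: print bundle info without dumping keys.
openssl pkcs12 -in filename.pfx -passin pass:changeit -info -nokeys -noout
```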
&lt;h1&gt;NSS Database Management&lt;/h1&gt;
&lt;p&gt;The NSS (Network Security Services) Database is a set of libraries designed to support cross-platform development of security-enabled client and server applications. Applications can use NSS for SSL/TLS, PKI (Public Key Infrastructure) certificate management, cryptographic operations, and other security standards. The NSS Database, specifically, is a critical component for managing certificates, keys, and other security assets.&lt;/p&gt;
&lt;h2&gt;Key Features of the NSS Database&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Certificate and Key Storage&lt;/strong&gt;: It stores and manages SSL/TLS certificates, private keys, and trust settings in a secure, encrypted database format. This storage is essential for applications needing to establish secure connections, authenticate themselves or their users, and ensure data integrity and confidentiality.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Cross-Platform Support&lt;/strong&gt;: NSS provides a platform-independent way to manage security assets, making it suitable for a wide range of operating systems and environments.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Security&lt;/strong&gt;: The database is designed with a strong focus on security, including support for various encryption algorithms and mechanisms to protect sensitive information.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;PKI Support&lt;/strong&gt;: It supports a comprehensive range of PKI standards, allowing applications to perform tasks such as certificate signing, issuance, and revocation checking.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Components of the NSS Database&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;CertDB&lt;/strong&gt;: A database for storing certificates, including user, server, and CA (Certificate Authority) certificates.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;KeyDB&lt;/strong&gt;: A database for storing private keys associated with the certificates.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;SecmodDB&lt;/strong&gt;: A database for managing PKCS#11 module configurations. PKCS#11 modules are used to interface with cryptographic tokens like smart cards or hardware security modules (HSMs).&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Management Tools&lt;/h2&gt;
&lt;p&gt;NSS comes with several command-line tools for managing the NSS Database, including:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;certutil&lt;/strong&gt;: For managing certificates and keys within the database.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;pk12util&lt;/strong&gt;: For importing and exporting certificates and keys in PKCS#12 format.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;modutil&lt;/strong&gt;: For managing PKCS#11 modules.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Usage&lt;/h2&gt;
&lt;p&gt;NSS Databases are often used in web browsers (like Mozilla Firefox), email clients, and other networked applications requiring secure communication. By managing cryptographic keys and certificates, the NSS Database plays a crucial role in enabling secure internet communications and data protection efforts across various applications.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Import Cert/Key (PEM) into NSS&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;certutil -A -n &quot;certificate name&quot; -t &quot;TCu,Cu,Tu&quot; -i certificate.pem -d sql:/path/to/nssdb&lt;/code&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;-t trustargs&lt;br /&gt;
Specify the trust attributes to modify in an existing certificate or to apply to a certificate when creating it or adding it to a database. There are three&lt;br /&gt;
available trust categories for each certificate, expressed in the order SSL, email, object signing for each trust setting. In each category position, use none,&lt;br /&gt;
any, or all of the attribute codes:&lt;/p&gt;
&lt;p&gt;· p - Valid peer&lt;/p&gt;
&lt;p&gt;· P - Trusted peer (implies p)&lt;/p&gt;
&lt;p&gt;· c - Valid CA&lt;/p&gt;
&lt;p&gt;· C - Trusted CA (implies c)&lt;/p&gt;
&lt;p&gt;· T - trusted CA for client authentication (ssl server only)&lt;/p&gt;
&lt;p&gt;The attribute codes for the categories are separated by commas, and the entire set of attributes enclosed by quotation marks. For example:&lt;/p&gt;
&lt;p&gt;-t &quot;TC,C,T&quot;&lt;/p&gt;
&lt;p&gt;Use the -L option to see a list of the current certificates and trust attributes in a certificate database.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note that the output of the -L option may include &quot;u&quot; flag, which means that there is a private key associated with the certificate. It is a dynamic flag and you cannot set it with certutil.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Import PFX into NSS DB&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;pk12util -i filename.pfx -d sql:/path/to/nssdb&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Export PEM from NSS DB&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;certutil -L -n &quot;certificate name&quot; -d sql:/path/to/nssdb -a &amp;gt; certificate.pem&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;List Keys from NSS DB&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;certutil -K -d sql:/path/to/nssdb&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Remove Key from NSS DB&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;certutil -D -n &quot;certificate name&quot; -d sql:/path/to/nssdb&lt;/code&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Cool! I believe that&apos;s a lot for today&apos;s topic!&lt;br /&gt;
Let&apos;s wrap up and see you next time!&lt;/p&gt;
&lt;/blockquote&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Crafting A Bash Script with Tmux</title><link>https://geekcoding101.com/posts/crafting-a-bash-script-with-tmux</link><guid isPermaLink="true">https://geekcoding101.com/posts/crafting-a-bash-script-with-tmux</guid><pubDate>Sun, 07 Apr 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;The Background...&lt;/h1&gt;
&lt;p&gt;I have a Django/Vue development environment running locally.&lt;/p&gt;
&lt;p&gt;To streamline my Django development, I typically open six tmux windows 😎 :&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Celery window - It also checks and starts necessary local services, like mailpit and redis, then finally starts Celery.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Flower window - Start Flower&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Django window - Start Django runserver&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Django manager shell window - For Django manager operations&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Heroku window - For checking Heroku status, committing, and other Heroku operations&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Vue window - Start npm run serve or build&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I use one tmux session to hold all of the above.&lt;/p&gt;
&lt;p&gt;However, my laptop sometimes needs a reboot, and afterwards all of my windows are gone 😓&lt;/p&gt;
&lt;p&gt;I configured tmux-resurrect and tmux-continuum to handle this scenario; they could restore the windows correctly, but they couldn&apos;t re-run the commands inside them.&lt;/p&gt;
&lt;p&gt;Let me show you the screenshots.&lt;/p&gt;
&lt;h1&gt;The problem...&lt;/h1&gt;
&lt;p&gt;Typically, my development windows look like this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./craft-script-with-tmux-01.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;As you see, the services are running within the respective windows.&lt;/p&gt;
&lt;p&gt;If I save them with tmux-resurrect, then after a reboot tmux-resurrect and tmux-continuum can of course restore the windows, but the services and all environment variables are gone.&lt;/p&gt;
&lt;p&gt;To simulate, let me kill all sessions in tmux, check the output:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./craft-script-with-tmux-02.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Now start tmux again; as the status line shows, tmux restored the previously saved windows:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./craft-script-with-tmux-03-tmux-status-scaled.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Let&apos;s check the window now:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./craft-script-with-tmux-04-no-service-scaled.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;None of the services is running 🙉&lt;/p&gt;
&lt;h1&gt;The Complain...&lt;/h1&gt;
&lt;blockquote&gt;
&lt;p&gt;As the supreme overlord of geekcoding101.com, I simply cannot let such imperfection slide.&lt;br /&gt;
Not on my watch.&lt;br /&gt;
Nope, not happening.&lt;br /&gt;
This ain&apos;t it, chief.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Okay, let&apos;s fix it!&lt;/p&gt;
&lt;h1&gt;The Fix...&lt;/h1&gt;
&lt;p&gt;&lt;img src=&quot;./jim-carrey-typing.gif&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./8-hours-later.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./jim-carrey-typing.gif&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;....&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./2-days-later.gif&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./i-am-back-01.gif&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Okay! I wrote a script.. oh no! Two scripts!&lt;/p&gt;
&lt;p&gt;One, start_tmux_dev_env.sh, creates all the windows; it invokes prepare_dev_env.sh, which exports functions that initialize environment variables and start services in specific windows.&lt;/p&gt;
&lt;p&gt;A snippet of start_tmux_dev_env.sh:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#!/bin/bash

# Please note: don&apos;t use dot in session name.
SESSION_NAME=&quot;matrixlink_ai&quot;

# Check if the tmux session already exists
tmux has-session -t $SESSION_NAME 2&amp;gt;/dev/null

if [ $? != 0 ]; then
  # Create a new detached tmux session named $SESSION_NAME
  tmux new-session -d -s $SESSION_NAME

  # Set up the &apos;celery&apos; window
  tmux rename-window -t $SESSION_NAME &apos;celery&apos;
  echo &quot;Starting celery window...&quot;
  echo &quot;Sleeping 5s to wait window celery finish initialization.....&quot;
  sleep 5
  echo &quot;Checking psql in window celery...&quot;
  tmux send-keys -t $SESSION_NAME &apos;psql -h localhost -p 5432 -d matrixlink_ai&apos; C-m
  sleep 2
  tmux send-keys -t $SESSION_NAME &apos;\q&apos; C-m
  sleep 1

  # Can&apos;t use an environment variable for the path of the script being sourced.
  # Remember to put below command to background, otherwise it will wait here forever.
  tmux send-keys -t $SESSION_NAME &apos;. ${YOUR_PATH_TO_PROJECT}/matrixlink.ai/utils/prepare_dev_env.sh &amp;amp;&amp;amp; setup_celery_window&apos; C-m &amp;amp;

  echo &quot;Starting flower window...&quot;
  tmux new-window -t $SESSION_NAME -n &apos;flower&apos;
  echo &quot;Sleeping 5s to wait window flower finish initialization.....&quot;
  sleep 5
  # $SESSION_NAME:flower is to specify the window
  # $SESSION_NAME.flower is to specify the pane
  tmux send-keys -t $SESSION_NAME:flower &apos;. ${YOUR_PATH_TO_PROJECT}/matrixlink.ai/utils/prepare_dev_env.sh &amp;amp;&amp;amp; setup_flower_window&apos; C-m &amp;amp;

  echo &quot;Starting nvm window...&quot;
  ...

  echo &quot;Starting Django window...&quot;
  ...

  echo &quot;Starting Django manager window...&quot;
  ...

  echo &quot;Starting Heroku window...&quot;
  ...
fi

# Attach to the tmux session
tmux attach -t $SESSION_NAME
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The prepare_dev_env.sh looks like:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#!/bin/sh

WORKDIR=&quot;${YOUR_PATH_TO_PROJECT}/github/matrixlink.ai/&quot;
CONDA_ENV=&quot;matrixlinkai.django&quot;

setup_celery_window() {
  conda activate ${CONDA_ENV}
  cd $WORKDIR
  brew services start mailpit
  brew services start redis
  export REDIS_URL=redis://localhost:6379/0
  export USE_DOCKER=no
  export CELERY_CONFIG_TASK_ALWAYS_EAGER=yes
  celery -A config.celery_app worker --loglevel=info
}

setup_flower_window() {
  conda activate ${CONDA_ENV}
  cd $WORKDIR
  export REDIS_URL=redis://localhost:6379/0
  export CELERY_BROKER_URL=$REDIS_URL
  export CELERY_FLOWER_USER=debug
  export CELERY_FLOWER_PASSWORD=debug
  export USE_DOCKER=no
  export CELERY_CONFIG_TASK_ALWAYS_EAGER=yes
  celery -A config.celery_app -b ${CELERY_BROKER_URL} flower --basic_auth=&quot;${CELERY_FLOWER_USER}:${CELERY_FLOWER_PASSWORD}&quot;
}

setup_django_window() {
  conda activate ${CONDA_ENV}
  cd $WORKDIR
  export REDIS_URL=redis://localhost:6379/0
  export CELERY_BROKER_URL=$REDIS_URL
  export CELERY_FLOWER_USER=debug
  export CELERY_FLOWER_PASSWORD=debug
  export USE_DOCKER=no
  export EMAIL_HOST=localhost
  export CELERY_CONFIG_TASK_ALWAYS_EAGER=yes
  
  # Please export sensitive information manually, like OPENAI key

  # You need to manually replace ${DB_USERNAME} here if not yet set environment variable.
  export DATABASE_URL=postgres://${DB_USERNAME}@127.0.0.1:5432/matrixlink_ai
  python manage.py migrate
  echo &quot;Sleeping 10s to wait npm to start in another window...&quot;
  sleep 10
  python manage.py runserver 0.0.0.0:8000
}

setup_django_manager_window() {
  conda activate ${CONDA_ENV}
  cd $WORKDIR
  export USE_DOCKER=no
  # You need to manually replace ${DB_USERNAME} here if not yet set environment variable.
  export DATABASE_URL=postgres://${DB_USERNAME}@127.0.0.1:5432/matrixlink_ai
  export REDIS_URL=redis://localhost:6379/0
  export CELERY_CONFIG_TASK_ALWAYS_EAGER=yes
  python manage.py shell
}

setup_nvm_window() {
  conda activate ${CONDA_ENV}
  cd $WORKDIR/frontend
  npm run serve
}
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;The End...&lt;/h1&gt;
&lt;p&gt;Now, after reboot, I can just invoke script start_tmux_dev_env.sh and it will spin up all windows for me in seconds!&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I&apos;ve recorded a video of the script in action; please check out my post &lt;a href=&quot;/posts/terminal-mastery-crafting-a-productivity-environment-with-iterm-tmux-and-beyond&quot;&gt;Terminal Mastery: Crafting A Productivity Environment With ITerm, Tmux, And Beyond&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Thanks for watching!&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Vue: Secrets to Resolving Empty index.html in WebHistory</title><link>https://geekcoding101.com/posts/vue-secrets-to-resolving-empty-index-html-in-webhistory</link><guid isPermaLink="true">https://geekcoding101.com/posts/vue-secrets-to-resolving-empty-index-html-in-webhistory</guid><pubDate>Mon, 08 Apr 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Greetings&lt;/h1&gt;
&lt;p&gt;Hi there!&lt;/p&gt;
&lt;p&gt;I was trying some new Vue stuff recently.&lt;/p&gt;
&lt;p&gt;I downloaded the free version of the Vue Argon Dashboard code and tried to build it locally.&lt;/p&gt;
&lt;p&gt;It&apos;s straightforward:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;nvm use lts/iron
npm install
npm run build
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./Solving-Empty-index.html-Puzzle-01-scaled.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Then I got the dist folder:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./Solving-Empty-index.html-Puzzle-02.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;h1&gt;Interesting...&lt;/h1&gt;
&lt;p&gt;Then I double-clicked the index.html, expecting it to display the beautiful landing page, but that didn&apos;t happen...&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./Solving-Empty-index.html-Puzzle-03.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;This is strange... What went wrong?&lt;/p&gt;
&lt;p&gt;I tried &lt;code&gt;npm run serve&lt;/code&gt;; it worked well, and I could see the portal and navigate between pages without issues.&lt;/p&gt;
&lt;h1&gt;I must fix this! Should be quick!&lt;/h1&gt;
&lt;p&gt;&lt;img src=&quot;./jim-carrey-typing.gif&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./8-hours-later.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./i-am-back-01.gif&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;h1&gt;Bingo!&lt;/h1&gt;
&lt;p&gt;The root cause is that the Vue project&apos;s router used &lt;code&gt;createWebHistory&lt;/code&gt; instead of &lt;code&gt;createWebHashHistory&lt;/code&gt;!&lt;/p&gt;
&lt;p&gt;This results in different handling of static assets and routing:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Absolute Paths for Static Assets&lt;/strong&gt;: By default, Vue CLI configures the build to use absolute paths for assets (JS, CSS, images, etc.). When I open the &lt;code&gt;index.html&lt;/code&gt; file directly in a browser (using the &lt;code&gt;file://&lt;/code&gt; protocol), these paths may not resolve correctly, because they expect to be served from a web server&apos;s root.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Single Page Application (SPA) Routing&lt;/strong&gt;: Vue applications, especially those built with Vue Router in &lt;code&gt;history&lt;/code&gt; mode, rely on the web server to correctly handle URLs. Directly opening &lt;code&gt;index.html&lt;/code&gt; doesn&apos;t allow Vue Router to intercept and handle the routing, leading to routes possibly not resolving as intended. &lt;code&gt;npm run serve&lt;/code&gt; starts a development server that correctly handles SPA routing, serving &lt;code&gt;index.html&lt;/code&gt; for all routes.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Choosing between &lt;code&gt;createWebHistory&lt;/code&gt; and &lt;code&gt;createWebHashHistory&lt;/code&gt; comes down to a few trade-offs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Clean URLs&lt;/strong&gt;: If having clean, professional-looking URLs is important for your application&apos;s user experience or branding, &lt;code&gt;createWebHistory&lt;/code&gt; is the preferred choice. This is often the case for public-facing production websites.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;SEO Considerations&lt;/strong&gt;: For SEO purposes, clean URLs (without hashes) are generally better. However, modern SEO practices and improved search engine capabilities have mitigated these concerns significantly.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Ease of Deployment&lt;/strong&gt;: &lt;code&gt;createWebHashHistory&lt;/code&gt; is simpler to deploy because it doesn&apos;t require specific server configurations to handle SPA routing. If your hosting environment or knowledge of server configurations is limited, this might be a more straightforward option.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Refresh Behavior&lt;/strong&gt;: With &lt;code&gt;createWebHistory&lt;/code&gt;, directly refreshing or entering URLs can lead to 404 errors if the server isn&apos;t correctly configured to redirect all such requests to &lt;code&gt;index.html&lt;/code&gt;. With &lt;code&gt;createWebHashHistory&lt;/code&gt;, this issue doesn&apos;t arise, making it a more foolproof solution for environments where server control is limited.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I just want to use &lt;code&gt;createWebHashHistory&lt;/code&gt; in my local development environment.&lt;/p&gt;
&lt;h1&gt;The fix&lt;/h1&gt;
&lt;p&gt;Now, the fix is easy.&lt;/p&gt;
&lt;p&gt;First, modify the &lt;code&gt;scripts&lt;/code&gt; section in package.json to specify the mode for serve and build; I also added two new entries, &lt;code&gt;serve_prod&lt;/code&gt; and &lt;code&gt;build_dev&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&quot;scripts&quot;: {
    &quot;serve&quot;: &quot;vue-cli-service serve --mode development&quot;,
    &quot;serve_prod&quot;: &quot;vue-cli-service serve --mode production&quot;,
    &quot;build&quot;: &quot;vue-cli-service build --mode production&quot;,
    &quot;build_dev&quot;: &quot;vue-cli-service build --mode development&quot;
  },
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Second, create or edit vue.config.js as below:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;module.exports = {
  publicPath: process.env.NODE_ENV === &apos;production&apos;
    ? &apos;/&apos;
    : &apos;&apos;,
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Lastly, update src/router/index.js to handle the mode accordingly:&lt;/p&gt;
&lt;p&gt;The original code was:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import { createRouter, createWebHistory } from &quot;vue-router&quot;;

// other existing code

const router = createRouter({
  history: createWebHistory(),
  routes,
  linkActiveClass: &quot;active&quot;
});
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now it looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import { createRouter, createWebHistory, createWebHashHistory } from &quot;vue-router&quot;;

// other existing code

// Determine the history mode based on the environment
const history = process.env.NODE_ENV === &apos;production&apos;
  ? createWebHistory()
  : createWebHashHistory();

const router = createRouter({
  history: history,
  routes,
  linkActiveClass: &quot;active&quot;
});

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, after running &lt;code&gt;npm run build_dev&lt;/code&gt; again, I can see the portal 😎&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Thanks for reading!&lt;br /&gt;
Have a good day!&lt;/p&gt;
&lt;/blockquote&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>An Adventurer&apos;s Guide to Base64, Base64URL, and Base32 Encoding</title><link>https://geekcoding101.com/posts/an-adventurers-guide-to-base64-base64url-and-base32-encoding</link><guid isPermaLink="true">https://geekcoding101.com/posts/an-adventurers-guide-to-base64-base64url-and-base32-encoding</guid><pubDate>Wed, 10 Apr 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Hey there!&lt;/p&gt;
&lt;p&gt;Recently, I encountered some encoding issues. Then I realized that I hadn&apos;t seen any article give a crisp yet interesting explanation of Base64/Base64URL/Base32 encoding! Ah! I should write one!&lt;/p&gt;
&lt;p&gt;So, grab your gear, and let&apos;s decode these fascinating encoding schemes together!&lt;/p&gt;
&lt;h1&gt;The Enigma of Base64 Encoding&lt;/h1&gt;
&lt;p&gt;Why do we need Base64?&lt;/p&gt;
&lt;p&gt;Imagine you&apos;re sending a beautiful picture postcard through the digital world, but the postal service (the internet, in this case) only handles plain text.&lt;/p&gt;
&lt;p&gt;How do you do it?&lt;/p&gt;
&lt;p&gt;Enter Base64 encoding – it&apos;s like magic that transforms binary data (like images) into a text format that can easily travel through the internet without getting corrupted.&lt;/p&gt;
&lt;p&gt;Base64 takes your binary data and represents it as text using 64 different characters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;10 digits: 0, 1, 2, ..., 9&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;26 uppercase letters: A, B, C, ..., Z&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;26 lowercase letters: a, b, c, ..., z&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Two special characters: &lt;code&gt;+&lt;/code&gt; and &lt;code&gt;/&lt;/code&gt; (typically)&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In more details, it will:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Grouping Bytes:&lt;/strong&gt; It groups the input bytes into sets of three, providing 24 bits in total.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Dividing Bits:&lt;/strong&gt; These 24 bits are then divided into four sets of 6 bits each.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Mapping to Characters:&lt;/strong&gt; Each set of 6 bits is mapped to one of 64 characters in the Base64 alphabet (A-Z, a-z, 0-9, +, and /). Since each set is 6 bits, they can represent values from 0 to 63, perfectly matching the 64 characters in the Base64 alphabet.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Padding:&lt;/strong&gt; If the total number of input bytes is not divisible by three, padding characters (typically &lt;code&gt;=&lt;/code&gt;) are added to make the final encoded output length a multiple of four. This ensures that the encoded data can be evenly divided back into its original byte format during decoding.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;It&apos;s widely used in email attachments, data URLs in web pages, and anywhere you need to squeeze binary data into text-only zones.&lt;/p&gt;
&lt;p&gt;A simple text like &quot;Hello!&quot;, when encoded in Base64, turns into &quot;SGVsbG8h&quot;.&lt;/p&gt;
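&lt;p&gt;You can watch both the mapping and the padding rule in action from the command line (using the coreutils &lt;code&gt;base64&lt;/code&gt; tool):&lt;/p&gt;

```shell
# "Hello!" is 6 bytes (two full 3-byte groups), so no '=' padding is needed.
printf 'Hello!' | base64    # SGVsbG8h
# "Hi" is 2 bytes, one short of a full group, so one '=' pads the output.
printf 'Hi' | base64        # SGk=
# Decoding reverses the mapping.
printf 'SGVsbG8h' | base64 -d    # Hello!
```

&lt;p&gt;Note that &lt;code&gt;printf&lt;/code&gt; is used instead of &lt;code&gt;echo&lt;/code&gt; so that no trailing newline sneaks into the encoded bytes.&lt;/p&gt;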
&lt;h2&gt;Usage of Base64 in Data URIs&lt;/h2&gt;
&lt;p&gt;Data URIs (Uniform Resource Identifiers) offer a powerful way to embed binary data, such as images, directly into HTML or CSS files, using Base64 encoding. This method eliminates the need for external file references, resulting in fewer HTTP requests and potentially faster page loads. Here&apos;s how it works in practice:&lt;/p&gt;
&lt;h3&gt;Embedding an Image in HTML Using Data URI&lt;/h3&gt;
&lt;p&gt;Let&apos;s say you have a small logo or icon that you want to include directly in your HTML page without linking to an external file. You can use Base64 to encode the image and then incorporate it directly into an &lt;code&gt;&amp;lt;img&amp;gt;&lt;/code&gt; tag&apos;s &lt;code&gt;src&lt;/code&gt; attribute.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Original Image&lt;/strong&gt;: An image file, &lt;code&gt;logo.png&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Base64 Encoding&lt;/strong&gt;: Convert &lt;code&gt;logo.png&lt;/code&gt; into a Base64-encoded string. The result will be a long text string.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Embed in HTML&lt;/strong&gt;: Use the encoded string within an &lt;code&gt;&amp;lt;img&amp;gt;&lt;/code&gt; tag as follows:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;lt;img src=&quot;data:image/png;base64,Base64EncodedStringHere&quot; alt=&quot;Logo&quot;&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Replace &lt;code&gt;Base64EncodedStringHere&lt;/code&gt; in above with the actual Base64-encoded string of your image. The &lt;code&gt;data:image/png;base64,&lt;/code&gt; part tells the browser that what follows is a Base64-encoded PNG image.&lt;/p&gt;
&lt;p&gt;Embedding images directly with Data URIs can reduce the number of HTTP requests, speeding up page loads for small resources.&lt;/p&gt;
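&lt;p&gt;The whole pipeline fits in a couple of shell lines. A minimal sketch, where a placeholder file stands in for a real PNG and the file name is illustrative (&lt;code&gt;base64 -w0&lt;/code&gt; is the GNU coreutils flag that disables line wrapping):&lt;/p&gt;

```shell
# Create a throwaway file standing in for logo.png (a real PNG works the same).
printf 'PNG bytes would go here' > logo.png
# Encode it without line wrapping and prepend the data URI header.
uri="data:image/png;base64,$(base64 -w0 logo.png)"
echo "$uri"
```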
&lt;h3&gt;Navigating the Waters of Base64URL&lt;/h3&gt;
&lt;p&gt;But, oh, the plot thickens with Base64URL. It&apos;s a close cousin of Base64, tailored for the web.&lt;/p&gt;
&lt;p&gt;The twist? It replaces the &lt;code&gt;+&lt;/code&gt; and &lt;code&gt;/&lt;/code&gt; characters with &lt;code&gt;-&lt;/code&gt; and &lt;code&gt;_&lt;/code&gt; to make it URL and filename safe. No more worrying about those characters being misinterpreted as special URL characters or directory paths!&lt;/p&gt;
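&lt;p&gt;Since that character swap is the whole difference, you can emulate Base64URL with a plain &lt;code&gt;tr&lt;/code&gt; when no dedicated tool is at hand (GNU coreutils&apos; &lt;code&gt;basenc --base64url&lt;/code&gt; does it natively). The two bytes below are chosen to hit the &lt;code&gt;+&lt;/code&gt; code point:&lt;/p&gt;

```shell
# Bytes 0xfb 0xef (octal \373 \357) map to '+' characters in plain Base64...
printf '\373\357' | base64                  # ++8=
# ...while the URL-safe alphabet swaps '+' for '-' and '/' for '_'.
printf '\373\357' | base64 | tr '+/' '-_'   # --8=
```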
&lt;h1&gt;The Expedition to Base32&lt;/h1&gt;
&lt;p&gt;Then, there&apos;s Base32, another encoding scheme in our adventure.&lt;/p&gt;
&lt;p&gt;It&apos;s less compact than Base64 but has its charm, especially when you need to ensure readability and avoid confusion.&lt;/p&gt;
&lt;p&gt;Base32 uses a set of 32 characters, &lt;strong&gt;making it more resilient against errors like misreading or miswriting.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Base32 shines in specific scenarios, such as encoding shared secrets for TOTP authenticator apps or hashed owner names in DNSSEC NSEC3 records, and anywhere you want to avoid characters that could be altered or misinterpreted.&lt;/p&gt;
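&lt;p&gt;A quick comparison makes the readability point concrete. The RFC 4648 Base32 alphabet is just A-Z and 2-7; the easily confused 0 and 1 are left out entirely, and there is only one letter case. Coreutils ships a &lt;code&gt;base32&lt;/code&gt; tool alongside &lt;code&gt;base64&lt;/code&gt;:&lt;/p&gt;

```shell
# The same input in both encodings: Base32 output is longer but single-case.
printf 'Hello!' | base64
printf 'Hello!' | base32
# Round-trip to confirm decoding recovers the original bytes.
printf 'Hello!' | base32 | base32 -d
```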
&lt;h1&gt;Why These Encoding Schemes Matter&lt;/h1&gt;
&lt;p&gt;Why do we bother with all these encoding shenanigans? It&apos;s all about compatibility and safety.&lt;/p&gt;
&lt;p&gt;These encoding schemes allow us to safely transmit binary data over mediums that only support text, ensuring that our data arrives intact and unaltered at its destination.&lt;/p&gt;
&lt;h1&gt;Choosing Your Path&lt;/h1&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Use &lt;strong&gt;Base64&lt;/strong&gt; when you need a compact, text-based representation of binary data for emails, data URIs, and when integrating with APIs that expect data in this format.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Opt for &lt;strong&gt;Base64URL&lt;/strong&gt; when your data needs to be part of URLs or file names, ensuring a smooth and safe journey through the web.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Choose &lt;strong&gt;Base32&lt;/strong&gt; for maximum readability and error resilience, perfect for transmitting data that might be entered manually or when you want to avoid certain problematic characters.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h1&gt;Alternatives and Mysteries Beyond&lt;/h1&gt;
&lt;p&gt;Our adventure doesn’t end here. There are other encoding schemes like Base58, popularized by Bitcoin, which further reduces the chance of misinterpretation by excluding similar-looking characters. And let&apos;s not forget hexadecimal encoding, a simpler form often used in programming and debugging.&lt;/p&gt;
&lt;p&gt;In conclusion, whether you’re encoding treasure maps to share with your fellow digital pirates or simply ensuring that your data travels safely across the vast internet, understanding when and how to use these encoding schemes is an essential skill in the digital world.&lt;/p&gt;
&lt;p&gt;Remember, the right encoding at the right time can be the difference between smooth sailing and getting lost in the digital sea.&lt;/p&gt;
&lt;p&gt;So, choose wisely!&lt;/p&gt;
&lt;p&gt;Until our next digital odyssey, keep exploring and encoding.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Terminal Mastery: Crafting a Productivity Environment with iTerm, tmux, and Beyond</title><link>https://geekcoding101.com/posts/terminal-mastery-crafting-a-productivity-environment-with-iterm-tmux-and-beyond</link><guid isPermaLink="true">https://geekcoding101.com/posts/terminal-mastery-crafting-a-productivity-environment-with-iterm-tmux-and-beyond</guid><pubDate>Thu, 11 Apr 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;I love working on Linux terminals&lt;/h1&gt;
&lt;p&gt;Rewind a decade or so, and you&apos;d find me ensconced within the embrace of a Linux terminal for the duration of my day. Here, amidst the digital ebb and flow, I thrived—maneuvering files and folders with finesse, weaving code in Vim, orchestrating services maintenance, decoding kernel dumps, and seamlessly transitioning across a mosaic of tmux sessions.&lt;/p&gt;
&lt;p&gt;The graphical user interface? A distant thought, unnecessary for the tapestry of tasks at hand.&lt;/p&gt;
&lt;p&gt;Every tech enthusiast harbors a unique sanctuary of productivity—a bespoke digital workshop where code flows like poetry and ideas ignite with the spark of creativity. It’s a realm where custom tools and secret utilities interlace, forming the backbone of unparalleled efficiency and innovation.&lt;/p&gt;
&lt;p&gt;Today, I&apos;m pulling back the curtain to reveal the intricacies of my personal setup on Mac.&lt;/p&gt;
&lt;p&gt;I invite you on this meticulous journey through the configuration of my Mac-based development sanctuary.&lt;/p&gt;
&lt;p&gt;Together, let&apos;s traverse this path, transforming the mundane into the magnificent, one command, one tool, one revelation at a time.&lt;/p&gt;
&lt;h1&gt;iTerm2&lt;/h1&gt;
&lt;p&gt;After setting up my account on the Mac, the initial terminal looked like this when I logged in:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-1024x357.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./boss-kid-boring01.gif&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Let&apos;s equip it with iTerm2!&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-3-1024x344.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;h4&gt;What is iTerm2?&lt;/h4&gt;
&lt;p&gt;iTerm2 is a replacement for Terminal and the successor to iTerm. It works on Macs with macOS 10.14 or newer. iTerm2 brings the terminal into the modern age with features you never knew you always wanted.&lt;/p&gt;
&lt;h4&gt;Why Do I Want It?&lt;/h4&gt;
&lt;p&gt;Check out the impressive &lt;a href=&quot;https://iterm2.com/features.html&quot;&gt;features and screenshots&lt;/a&gt;. If you spend a lot of time in a terminal, then you&apos;ll appreciate all the little things that add up to a lot. It is free software and you can find the source code on &lt;a href=&quot;https://github.com/gnachman/iTerm2&quot;&gt;Github&lt;/a&gt;.&lt;/p&gt;
&lt;h4&gt;How Do I Use It?&lt;/h4&gt;
&lt;p&gt;Try the &lt;a href=&quot;https://iterm2.com/faq.html&quot;&gt;FAQ&lt;/a&gt; or the &lt;a href=&quot;https://iterm2.com/documentation.html&quot;&gt;documentation&lt;/a&gt;. Got problems or ideas? Report them in the &lt;a href=&quot;https://iterm2.com/bugs&quot;&gt;bug tracker&lt;/a&gt;, take it to the &lt;a href=&quot;https://groups.google.com/group/iterm2-discuss&quot;&gt;forum&lt;/a&gt;, or send me email (gnachman at gmail dot com).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Head over to https://iterm2.com/, download it, and finish the installation. Open it:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-1-1024x432.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Well, still not impressive. But now you have all of iTerm2&apos;s advanced features!&lt;/p&gt;
&lt;p&gt;This blog post is not focused on iTerm2 itself, so I won&apos;t go through those fancy features right now; please explore them on the official website.&lt;/p&gt;
&lt;p&gt;Let&apos;s start to customize on it.&lt;/p&gt;
&lt;h1&gt;Wait, Nerd ... Font&lt;/h1&gt;
&lt;p&gt;Before we jump into customizing iTerm2, I want to introduce you to Nerd Fonts.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It will be required and installed by https://github.com/romkatv/powerlevel10k, which we will talk about later.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/ryanoasis/nerd-fonts&quot;&gt;&lt;img src=&quot;./image-2.png&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This project aims to enhance the usability and aesthetic appeal of the development environment without sacrificing the functionality or readability of the text. The added icons can represent common actions or tools in the development workflow, allowing for a more intuitive and visually engaging interface.&lt;/p&gt;
&lt;p&gt;By incorporating icons directly into the fonts, Nerd Fonts allows developers to use these icons across different applications and tools seamlessly, without needing to rely on external libraries or tool-specific extensions. This can simplify setup and configuration across tools and platforms, providing a consistent and enriched visual experience.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h1&gt;Customize iTerm2&lt;/h1&gt;
&lt;p&gt;Just follow me:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-4-1024x414.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-5-1024x522.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Set your favorite character as background image if you like:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-6-1024x642.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Let&apos;s compare previous and now:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;img src=&quot;./image-1024x357.png&quot; alt=&quot;&quot; /&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;img src=&quot;./image-8-1024x600.png&quot; alt=&quot;&quot; /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h1&gt;Oh-my-zsh and powerlevel10k&lt;/h1&gt;
&lt;p&gt;Install &lt;a href=&quot;https://github.com/ohmyzsh/ohmyzsh&quot;&gt;https://github.com/ohmyzsh/ohmyzsh&lt;/a&gt; (to manage zsh):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sh -c &quot;$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./image-9-1024x600.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Install &lt;a href=&quot;https://github.com/romkatv/powerlevel10k&quot;&gt;https://github.com/romkatv/powerlevel10k&lt;/a&gt; for ohmyzsh and configure it (powerlevel10k is a theme of zsh).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git clone --depth=1 https://github.com/romkatv/powerlevel10k.git ${ZSH_CUSTOM:-$HOME/.oh-my-zsh/custom}/themes/powerlevel10k
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Set environment:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;echo &quot;source ~/.oh-my-zsh/custom/themes/powerlevel10k/powerlevel10k.zsh-theme&quot; &amp;gt;&amp;gt; ~/.zshrc
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Restart Zsh with &lt;code&gt;exec zsh&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;You should see:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-10-1024x761.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Let&apos;s install the fonts and follow the wizard, choose whatever you like!&lt;/p&gt;
&lt;p&gt;Now it looks much better!&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-12-1024x594.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Much better, but it still doesn&apos;t meet my expectations!&lt;/p&gt;
&lt;h1&gt;Let&apos;s begin working with our adaptable TMUX expert now!&lt;/h1&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/tmux/tmux/wiki&quot;&gt;&lt;img src=&quot;./image-13.png&quot; alt=&quot;&quot; /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;tmux is a terminal multiplexer. It lets you switch easily between several programs in one terminal, detach them (they keep running in the background) and reattach them to a different terminal.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;tmux should already be installed by default. Type &lt;code&gt;tmux&lt;/code&gt; and you should see:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-14-1024x597.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./boss-kid-boring01.gif&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Can&apos;t wait to customize it!&lt;/p&gt;
&lt;p&gt;I am using &lt;a href=&quot;https://github.com/gpakosz/.tmux.git&quot;&gt;https://github.com/gpakosz/.tmux.git&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It&apos;s a self-contained, pretty and versatile &lt;code&gt;.tmux.conf&lt;/code&gt; configuration file.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;pre&gt;&lt;code&gt;cd ~
rm -fr .tmux
git clone https://github.com/gpakosz/.tmux.git
ln -s -f .tmux/.tmux.conf
cp .tmux/.tmux.conf.local .
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Append the lines below to &lt;code&gt;.tmux.conf.local&lt;/code&gt;, just before the line &quot;&lt;code&gt;# -- custom variables&lt;/code&gt;&quot;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# increase history size
set -g history-limit 9999999
# start with mouse mode enabled
set -g mouse on

bind-key -n C-S-Left swap-window -t -1\; select-window -t -1
bind-key -n C-S-Right swap-window -t +1\; select-window -t +1

# -- custom variables ----------------------------------------------------------
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I am a fan of Vi/Vim, so I must enable Vi mode in &quot;&lt;code&gt;~/.tmux.conf.local&lt;/code&gt;&quot;:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-16-1024x101.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Customize status bar:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;tmux_conf_theme_status_right_fg=&quot;$tmux_conf_theme_colour_12,$tmux_conf_theme_colour_14,$tmux_conf_theme_colour_6&quot;
tmux_conf_theme_status_right_bg=&quot;$tmux_conf_theme_colour_15,$tmux_conf_theme_colour_17,$tmux_conf_theme_colour_9&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./image-17-1024x131.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;tmux_conf_theme_left_separator_main=&apos;\uE0B0&apos;
tmux_conf_theme_left_separator_sub=&apos;\uE0B1&apos;
tmux_conf_theme_right_separator_main=&apos;\uE0B2&apos;
tmux_conf_theme_right_separator_sub=&apos;\uE0B3&apos;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./image-18-1024x212.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Find below lines in &quot;&lt;code&gt;~/.tmux.conf.local&lt;/code&gt;&quot; and uncomment them to enable:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-19-1024x81.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Set an icon for the left status:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-20-1024x17.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Now reload configuration:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;tmux source ~/.tmux.conf
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Check it now!&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-22-1024x557.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;I have a script that launches tmux when iTerm2 starts; here you go:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#!/bin/zsh

# Check whether the &quot;kongfu&quot; session already has a client attached.
tmux ls 2&amp;gt;/dev/null | grep kongfu | grep -q attached

if [[ $? != 0 ]]; then
  # Not attached yet: attach to the session, or create it if it doesn&apos;t exist.
  tmux attach -t kongfu || tmux new-session -s kongfu
else
  echo &quot;********************************************************************************&quot;
  echo &quot;* Ignore attaching tmux kongfu session as it has been attached already.        *&quot;
  echo &quot;********************************************************************************&quot;
fi
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Save it as &quot;&lt;code&gt;~/bin/tmux_init.sh&lt;/code&gt;&quot;, run &quot;&lt;code&gt;chmod 755 ~/bin/tmux_init.sh&lt;/code&gt;&quot;, then configure it in the iTerm2 default profile:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-23-1024x692.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;h1&gt;Customize P10K Status Bar for Anaconda and Node.js&lt;/h1&gt;
&lt;p&gt;I have Anaconda and Node.js environments.&lt;/p&gt;
&lt;p&gt;I am not satisfied with the default color settings for Anaconda and Node.js. They&apos;re ugly.&lt;/p&gt;
&lt;p&gt;Open your &lt;code&gt;~/.p10k.zsh&lt;/code&gt; and make the changes below:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-24-1024x99.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-25-1024x94.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-26-1024x316.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Now source the file again:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;source ~/.p10k.zsh
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Show time!&lt;/h1&gt;
&lt;p&gt;I recorded this to show you what it looks like in my environment:&lt;/p&gt;
&lt;p&gt;https://www.youtube.com/watch?v=TBfvoSeyP4U&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Fix Font in VSCode Terminal</title><link>https://geekcoding101.com/posts/fix-font-in-vscode-terminal</link><guid isPermaLink="true">https://geekcoding101.com/posts/fix-font-in-vscode-terminal</guid><pubDate>Sat, 13 Apr 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;The Font Problem in VSCode&lt;/h1&gt;
&lt;p&gt;After completing the configuration in &lt;a href=&quot;/posts/terminal-mastery-crafting-a-productivity-environment-with-iterm-tmux-and-beyond&quot;&gt;Terminal Mastery: Crafting A Productivity Environment With ITerm, Tmux, And Beyond&lt;/a&gt;, we got a nice terminal:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-27-1024x485.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;However, after I installed VSCode, its integrated terminal couldn&apos;t display certain glyphs. It looked like this:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;(screenshot: VSCode terminal with broken Nerd Font glyphs)&lt;/em&gt;&lt;/p&gt;
&lt;h1&gt;The Fix&lt;/h1&gt;
&lt;p&gt;We can fix it by updating the terminal font family in VSCode.&lt;/p&gt;
&lt;p&gt;1. Identify the name of the font family. Open Font Book on the Mac, and we can see:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-28-1024x413.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;The font that supports those glyphs is &quot;MesloLGM Nerd Font Mono&quot;, which is also what I configured for iTerm2.&lt;/p&gt;
&lt;p&gt;2. In VSCode, press Command + comma to open Settings, search for &quot;&lt;code&gt;terminal.integrated.fontFamily&lt;/code&gt;&quot;, and set the font name as below:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-30-1024x343.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
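&lt;p&gt;Equivalently, you can put the setting straight into &lt;code&gt;settings.json&lt;/code&gt;; the value must match the family name exactly as Font Book reports it:&lt;/p&gt;

```json
{
  "terminal.integrated.fontFamily": "MesloLGM Nerd Font Mono"
}
```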
&lt;p&gt;3. Now we can see it displays correctly:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-31-1024x298.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Well done!&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Supervised Machine Learning - Day 1</title><link>https://geekcoding101.com/posts/supervised-machine-learning-day-1</link><guid isPermaLink="true">https://geekcoding101.com/posts/supervised-machine-learning-day-1</guid><pubDate>Sun, 14 Apr 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;The Beginning&lt;/h1&gt;
&lt;p&gt;As I&apos;ve been advancing my AI-powered product knowledge-base chatbot, built on Django/LangChain/OpenAI/Chroma/Gradio and sitting at the AI application/framework layer, I&apos;ve also kept an eye on how to build a pipeline for assessing the accuracy of machine learning models, which belongs to the AI DevOps/infra layer.&lt;/p&gt;
&lt;p&gt;But I realized that I have no idea how to measure a model&apos;s accuracy. This made me upset.&lt;/p&gt;
&lt;p&gt;Then I started looking for answers.&lt;/p&gt;
&lt;p&gt;My first Google search was &quot;how to measure llm accuracy&quot;, which brought me to &lt;a href=&quot;https://www.linkedin.com/pulse/evaluating-large-language-models-llms-standard-set-metrics-biswas-ecjlc/&quot;&gt;Evaluating Large Language Models (LLMs): A Standard Set of Metrics for Accurate Assessment&lt;/a&gt;. It&apos;s informative, and not a lengthy article, so I read it through. It opened a new world to me.&lt;/p&gt;
&lt;p&gt;There is a standard set of metrics for evaluating LLMs, including:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Perplexity&lt;/strong&gt; - A measure of how well a language model predicts a sample of text. It is calculated as the inverse probability of the test set normalized by the number of words.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Accuracy&lt;/strong&gt; - It is a measure of how well a language model makes correct predictions. It is calculated as the number of correct predictions divided by the total number of predictions.&lt;br /&gt;
Accuracy can be calculated using the following formula: accuracy = (number of correct predictions) / (total number of predictions).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;F1-score&lt;/strong&gt; - It is a measure of a language model&apos;s balance between precision and recall. It is calculated as the harmonic mean of precision and recall.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;ROUGE score&lt;/strong&gt; - It is a measure of how well a language model generates text that is similar to reference texts. It is commonly used for text generation tasks such as summarization and paraphrasing.&lt;br /&gt;
There are different ways to calculate ROUGE score, including ROUGE-N, ROUGE-L, and ROUGE-W.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;BLEU score&lt;/strong&gt; - This is to measure how well a language model generates text that is fluent and coherent. It is commonly used for text generation tasks such as machine translation and image captioning.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;METEOR score&lt;/strong&gt; - It is about how well a language model generates text that is accurate and relevant. It combines both precision and recall to evaluate the quality of the generated text.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Question answering metrics&lt;/strong&gt; - Question answering metrics are used to evaluate the ability of a language model to provide correct answers to questions. Common metrics include accuracy, F1-score, and Macro F1-score.&lt;br /&gt;
Question answering metrics can be calculated by comparing the generated answers to one or more reference answers and calculating a score based on the overlap between them.&lt;br /&gt;
Let&apos;s say we have a language model that is trained to answer questions about a given text. We test the model on a set of 100 questions, and the generated answers are compared to the actual answers. The accuracy, F1-score, and Macro F1-score of the model are calculated based on the overlap between the generated answers and the actual answers.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Sentiment analysis metrics&lt;/strong&gt; - Sentiment analysis metrics are used to evaluate the ability of a language model to classify sentiments correctly. Common metrics include accuracy, weighted accuracy, and macro F1-score.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Named entity recognition metrics&lt;/strong&gt; - It is used to evaluate the ability of a language model to identify entities correctly. Common metrics include accuracy, precision, recall, and F1-score.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Contextualized word embeddings&lt;/strong&gt; - It is used to evaluate the ability of a language model to capture context and meaning in word representations. They are generated by training the language model to predict the next word in a sentence given the previous words.&lt;br /&gt;
Let&apos;s say we have a language model that is trained to generate word embeddings for a given text. We test the model on a set of 100 texts, and the generated embeddings are compared to the actual embeddings. The evaluation can be done using various methods, such as cosine similarity and Euclidean distance.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
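&lt;p&gt;As a concrete instance of the F1-score above: with precision $0.8$ and recall $0.6$,&lt;/p&gt;
&lt;p&gt;$$F_1 = 2 \cdot \frac{0.8 \times 0.6}{0.8 + 0.6} = \frac{0.96}{1.4} \approx 0.686$$&lt;/p&gt;
&lt;p&gt;The harmonic mean punishes imbalance: a model with precision $1.0$ but recall $0.1$ only scores $F_1 \approx 0.18$, even though the plain average of the two would be $0.55$.&lt;/p&gt;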
&lt;p&gt;I don&apos;t know all of them or where to start!&lt;/p&gt;
&lt;p&gt;I had to tell myself, &quot;Man, you don&apos;t know machine learning...&quot; So my next search was &quot;machine learning course&quot;, and Andrew Ng&apos;s &lt;a href=&quot;https://www.coursera.org/learn/machine-learning&quot;&gt;Supervised Machine Learning: Regression and Classification&lt;/a&gt; came out on top of the Google search results! It&apos;s famous, and I knew of it before!&lt;/p&gt;
&lt;p&gt;Then I made a decision: take action now and finish it thoroughly!&lt;/p&gt;
&lt;p&gt;I immediately enrolled in the course. Now let&apos;s start the journey!&lt;/p&gt;
&lt;h1&gt;Day 1 Started&lt;/h1&gt;
&lt;h2&gt;Basics&lt;/h2&gt;
&lt;p&gt;1. What is ML?&lt;/p&gt;
&lt;p&gt;Defined by Arthur Samuel back in 1959 😯&lt;/p&gt;
&lt;p&gt;&quot;&lt;em&gt;Field of study that gives computers the ability to learn &lt;strong&gt;without being explicitly programmed&lt;/strong&gt;.&lt;/em&gt;&quot;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The definition above gives the key point (the highlighted part), which answers a question from one of my colleagues: &quot;What&apos;s the difference between a programmed system triggering alerts on events and an AI-powered system that also triggers alerts on events?&quot; He couldn&apos;t tell the difference. On that day, I tried to explain but couldn&apos;t find the right words. Now I have found them.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;2. What are major ML algorithms?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Supervised Learning&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Unsupervised Learning&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Reinforcement Learning&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;3. Supervised Learning&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Examples: visual inspection, self-driving car, online advertising, machine translation, speech recognition, spam filtering.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3.1. Regression&lt;/strong&gt; (Example, house price prediction)&lt;/p&gt;
&lt;p&gt;By regression, we mean predicting a number from infinitely many possible outputs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3.2. Classification&lt;/strong&gt; (Example, Breast Cancer Detection)&lt;/p&gt;
&lt;p&gt;Malignant, Benign&lt;/p&gt;
&lt;p&gt;By classification, we mean predicting categories; categories don&apos;t have to be numbers and can be non-numeric.&lt;/p&gt;
&lt;p&gt;It can predict whether a picture is that of a cat or a dog. And it can predict if a tumor is benign or malignant. Categories can also be numbers like 0, 1 or 0, 1, 2. But what makes classification different from regression when you&apos;re interpreting the numbers is that classification predicts a small finite limited set of possible output categories such as 0, 1 and 2 but not all possible numbers in between like 0.5 or 1.7.&lt;/p&gt;
&lt;p&gt;The first example, predicting cancer, has only one input: the size of the tumor. Classification also works with more inputs, like this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./classification-example2.png&quot; alt=&quot;&quot; title=&quot;classification-example2&quot; /&gt;&lt;/p&gt;
&lt;p&gt;The key is to find out the boundary of benign and malignant.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. Unsupervised Learning&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Unsupervised learning is about finding something interesting in unlabeled data, and it involves algorithms such as clustering.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4.1 Clustering algorithm&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Google News uses a clustering algorithm.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;In other words, a clustering algorithm takes data without labels and tries to automatically group similar examples into clusters by finding some structure, pattern, or something interesting in the data.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4.2. Anomaly Detection&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Find unusual data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4.3. Dimensionality Reduction&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Compress data using fewer numbers.&lt;/p&gt;
&lt;h2&gt;Regression Model&lt;/h2&gt;
&lt;p&gt;Selling-house example: building a linear model to predict how much the house could sell for.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Any supervised learning model that predicts a number such as 220,000 or 1.5 or negative 33.2 is addressing what&apos;s called a regression problem. &lt;br /&gt;
Linear regression is one example of a regression model.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Linear Regression&lt;/h3&gt;
&lt;p&gt;$$f_{w,b}(x) = wx + b$$&lt;/p&gt;
&lt;p&gt;Or simpler format:&lt;/p&gt;
&lt;p&gt;$$f(x) = wx + b$$&lt;/p&gt;
&lt;p&gt;Now the question is, how do you find values for &lt;em&gt;$w$&lt;/em&gt; and &lt;em&gt;$b$&lt;/em&gt; so that the prediction $\hat{y}^{(i)}$ is close to the true target $y^{(i)}$ for many or maybe all training examples $(x^{(i)}, y^{(i)})$?&lt;/p&gt;
&lt;p&gt;How to measure how well a line fits the training data?&lt;/p&gt;
&lt;p&gt;$$ (\hat{y}^{(i)} - y^{(i)})^2 $$&lt;/p&gt;
&lt;p&gt;When measuring the error &lt;code&gt;(ŷ - y)&lt;/code&gt; for example $i$, we compute this squared error term.&lt;/p&gt;
&lt;p&gt;Square error cost function:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./square-error-cost-function.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Well, I spent almost 20mins to figure out how to write math formula on Wordpress, look! 🥳&lt;/p&gt;
&lt;p&gt;$$J(w, b) = \frac{1}{2m} \sum_{i=1}^{m}(\hat{y}^{(i)}-y^{(i)})^{2}$$&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The extra division by 2 is just meant to make some of our later calculations look neater, but the cost function still works whether you include this division by 2 or not.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The final one:&lt;/p&gt;
&lt;p&gt;$$J(w, b) = \frac{1}{2m} \sum_{i=1}^{m}(f_{w,b}(x^{(i)})-y^{(i)})^{2}$$&lt;/p&gt;
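&lt;p&gt;As a quick sanity check of the formula, take a toy training set $\{(1, 1), (2, 2)\}$ (so $m = 2$) and try $w = 0.5$, $b = 0$:&lt;/p&gt;
&lt;p&gt;$$J(0.5, 0) = \frac{1}{2 \cdot 2}\left[(0.5 \cdot 1 - 1)^2 + (0.5 \cdot 2 - 2)^2\right] = \frac{0.25 + 1}{4} = 0.3125$$&lt;/p&gt;
&lt;p&gt;With $w = 1$, $b = 0$, both predictions are exact and $J(1, 0) = 0$: the line fits the data perfectly.&lt;/p&gt;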
&lt;p&gt;This graph comparison (when $b = 0$) is important!&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-33-1024x511.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;The Challenge Part in Today&apos;s Learning Journey&lt;/h2&gt;
&lt;p&gt;&quot;Visualizing the cost function&quot; was the most challenging part...&lt;/p&gt;
&lt;p&gt;I watched it at least twice then got the idea.&lt;/p&gt;
&lt;h2&gt;Compare Regression/Classification Models&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Items&lt;/th&gt;
&lt;th&gt;Regression&lt;/th&gt;
&lt;th&gt;Classification&lt;/th&gt;
&lt;th&gt;Comments&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Output&lt;/td&gt;
&lt;td&gt;Infinitely many values&lt;/td&gt;
&lt;td&gt;A finite set of categories&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h1&gt;Terminology&lt;/h1&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Term&lt;/th&gt;
&lt;th&gt;Comments&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Discrete Category&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Training Set&lt;/td&gt;
&lt;td&gt;The data set we use to train the model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;input variable or feature or input feature&lt;/td&gt;
&lt;td&gt;Denote as $x$&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;target variable&lt;/td&gt;
&lt;td&gt;Denote as $y$&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$(x, y)$&lt;/td&gt;
&lt;td&gt;Denotes a single training example&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$(x^{(i)}, y^{(i)})$&lt;/td&gt;
&lt;td&gt;$i$-th training example&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;hypothesis&lt;/td&gt;
&lt;td&gt;&quot;$f$&quot; means function, historically, it&apos;s called hypothesis.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$\hat{y}$&lt;/td&gt;
&lt;td&gt;y hat (On Mac press Option + i then followed by the letter y, then you can get ŷ). In machine learning, the convention is that y-hat is the estimate or the prediction for y.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$y$&lt;/td&gt;
&lt;td&gt;When the symbol is just the letter y, then that refers to the target, which is the actual true value in the training set.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;$\hat{y} - y$&lt;/td&gt;
&lt;td&gt;This difference is called the &quot;error&quot;: it measures how far off the prediction is from the target&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;parabola curve&lt;/td&gt;
&lt;td&gt;You know, quadratic function&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Univariate&lt;/td&gt;
&lt;td&gt;Uni means one in Latin. Univariate is just a fancy way of saying one variable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;parameters/coefficients/weights&lt;/td&gt;
&lt;td&gt;In machine learning parameters of the model are the variables you can adjust during training in order to improve the model.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;downward-sloping line&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;hammock&lt;/td&gt;
&lt;td&gt;Have some fun...&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;contour plot&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;topographical&lt;/td&gt;
&lt;td&gt;Learning some geography when learning ML ...&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gradient descent&lt;/td&gt;
&lt;td&gt;This algorithm is one of the most important algorithms in machine learning. Gradient descent and variations on gradient descent are used to train, not just linear regression, but some of the biggest and most complex models in all of AI.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Supervised Machine Learning – Day 2 &amp;amp; 3 - On My Way To Becoming A Machine Learning Person</title><link>https://geekcoding101.com/posts/supervised-machine-learning-day-2-3-on-my-way-to-becoming-a-machine-learning-person</link><guid isPermaLink="true">https://geekcoding101.com/posts/supervised-machine-learning-day-2-3-on-my-way-to-becoming-a-machine-learning-person</guid><pubDate>Tue, 16 Apr 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;A brief introduction&lt;/h1&gt;
&lt;p&gt;Day 2 I was busy and managed only 15 mins for &quot;Supervised Machine Learning&quot; video + 15 mins watched &lt;a href=&quot;https://www.youtube.com/watch?v=aircAruvnKk&amp;amp;list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi&quot;&gt;But what is a neural network? | Chapter 1, Deep learning&lt;/a&gt; from &lt;a href=&quot;https://www.youtube.com/@3blue1brown&quot;&gt;3blue1brown&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Day 3 I managed 30+ mins on &quot;Supervised Machine Learning&quot;, and spent some time reading articles, like Parul Pandey&apos;s &lt;a href=&quot;https://towardsdatascience.com/understanding-the-mathematics-behind-gradient-descent-dde5dc9be06e&quot;&gt;Understanding the Mathematics behind Gradient Descent&lt;/a&gt; - it&apos;s really good. I like math 😂&lt;/p&gt;
&lt;p&gt;So these notes are mixed.&lt;/p&gt;
&lt;h1&gt;Notes&lt;/h1&gt;
&lt;h2&gt;Implementing gradient descent&lt;/h2&gt;
&lt;h3&gt;Notation&lt;/h3&gt;
&lt;p&gt;I was struggling to write the LaTeX for the formulas, then found this table useful (&lt;a href=&quot;https://ctan.math.utah.edu/ctan/tex-archive/macros/latex/contrib/mlmath/mlmath.pdf&quot;&gt;source is here&lt;/a&gt;):&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-32-1024x1024.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Andrew said I don&apos;t need to worry about derivatives and calculus at all. I trust him, but I still dug into my bookcase, found the advanced mathematics books I used in college, and spent 15 minutes reviewing them. Yes, I don&apos;t need to worry.&lt;/p&gt;
&lt;p&gt;Snapped two epic shots of my &quot;Advanced Mathematics&quot; book used in my college time to show off my killer skills in derivative and calculus - pretty sure I&apos;ve unlocked Math Wizard status!&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./adv-math-01-768x1024.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./adv-math-02-768x1024.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;Reading online&lt;/h2&gt;
&lt;p&gt;Okay. Reading some online articles.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If we are able to compute the derivative of a function, we know in which direction to proceed to minimize it (for the cost function).&lt;/p&gt;
&lt;p&gt;From Parul Pandey, &lt;a href=&quot;https://towardsdatascience.com/understanding-the-mathematics-behind-gradient-descent-dde5dc9be06e&quot;&gt;Understanding the Mathematics behind Gradient Descent&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Parul briefly introduced the power rule and the chain rule; fortunately, I still remember them from college. I am so proud.&lt;/p&gt;
&lt;p&gt;After reviewing various explanations of gradient descent, I truly appreciate Andrew&apos;s straightforward and precise approach!&lt;/p&gt;
&lt;p&gt;He was a bit playful at times, drawing a stick man walking down a hill step by step and suggesting that one might imagine flowers in the valley and clouds in the sky. This comfortable and engaging method made me forget I was in learning mode!&lt;/p&gt;
&lt;h2&gt;The hardest part still comes&lt;/h2&gt;
&lt;p&gt;But anyway, the hardest part still comes, I need to master this at the end:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-35-1024x372.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;$$
\text{repeat until convergence: } \left\{
\begin{aligned}
w &amp;amp;= w - \alpha \frac{\partial J(w,b)}{\partial w} \\
b &amp;amp;= b - \alpha \frac{\partial J(w,b)}{\partial b}
\end{aligned}
\right\}
$$&lt;/p&gt;
&lt;h2&gt;Learning Rate&lt;/h2&gt;
&lt;p&gt;If ⍺ is too small -&amp;gt; baby steps, taking a long time to reach the minimum.&lt;/p&gt;
&lt;p&gt;If ⍺ is too big -&amp;gt; strides that may overshoot and never reach the minimum: gradient descent can fail to converge, or even diverge.&lt;/p&gt;
&lt;p&gt;What if w is already near a local minimum?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The derivative will become smaller.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The updated step will also become smaller&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As a result, gradient descent can reach the minimum with a fixed learning rate, without needing to decrease it.&lt;/p&gt;
&lt;p&gt;At here, the conclusion came:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;So that&apos;s the gradient descent algorithm, you can use it to try to minimize any cost function J. &lt;br /&gt;
Not just the mean squared error cost function that we&apos;re using for the new regression.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Tips:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Used derivative (partial derivative)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Needs to update w and b simultaneously. This requires temp variables (of course, a standard practice in programming)&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
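&lt;p&gt;The update rule and tip 2 above (the simultaneous update via temp variables) can be sketched in NumPy like this (my own code; variable names are my assumptions, not from the course):&lt;/p&gt;

```python
import numpy as np

def gradient_descent_step(x, y, w, b, alpha):
    """One gradient descent step for univariate linear regression."""
    m = x.shape[0]
    error = (w * x + b) - y               # f_wb(x_i) - y_i for every example
    dj_dw = np.sum(error * x) / m         # partial derivative of J w.r.t. w
    dj_db = np.sum(error) / m             # partial derivative of J w.r.t. b
    # Simultaneous update: compute both new values before assigning either
    tmp_w = w - alpha * dj_dw
    tmp_b = b - alpha * dj_db
    return tmp_w, tmp_b

x_train = np.array([1.0, 2.0, 3.0])
y_train = np.array([2.0, 4.0, 6.0])       # true relationship: y = 2x
w, b = 0.0, 0.0
for _ in range(1000):
    w, b = gradient_descent_step(x_train, y_train, w, b, alpha=0.1)
print(w, b)                               # w approaches 2, b approaches 0
```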
&lt;h2&gt;Finished Week 1&apos;s course!&lt;/h2&gt;
&lt;p&gt;It&apos;s a milestone to me!&lt;/p&gt;
&lt;p&gt;I used 3 days (technically two days) to finish the first week&apos;s course!&lt;/p&gt;
&lt;p&gt;I love ML! Let&apos;s keep the momentum!&lt;/p&gt;
&lt;h1&gt;Terminology&lt;/h1&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Term&lt;/th&gt;
&lt;th&gt;Comments&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Squared error cost function&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Local minimal&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tangent line&lt;/td&gt;
&lt;td&gt;Andrew introduced this when trying to show how derivative impact the cost.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Converge/Diverge&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Convex function&lt;/td&gt;
&lt;td&gt;It has a single global minimum because of this bowl-shape.    The technical term for this is that this cost function is a convex function.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Batch Gradient Descent&lt;/td&gt;
&lt;td&gt;&quot;Batch&quot;: each step of gradient descent uses all the training examples.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Mastering Multiple Features &amp; Vectorization: Supervised Machine Learning – Day 4 and 5</title><link>https://geekcoding101.com/posts/mastering-multiple-features-vectorization-supervised-machine-learning-day-4-and-5</link><guid isPermaLink="true">https://geekcoding101.com/posts/mastering-multiple-features-vectorization-supervised-machine-learning-day-4-and-5</guid><pubDate>Thu, 18 Apr 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;So difficult to manage some time on this&lt;/h1&gt;
&lt;p&gt;Day 4 was a long day for me; I just got 15 mins before bed to quickly skim through the videos &quot;multiple features&quot; and &quot;vectorization part 1&quot;.&lt;/p&gt;
&lt;p&gt;Day 5, an even longer day than yesterday... went to urgent care in the morning... then back-to-back meetings after coming back... lunch... back-to-back meetings again... need to step out again...&lt;/p&gt;
&lt;p&gt;Anyway, that&apos;s life.&lt;/p&gt;
&lt;h1&gt;Multiple features (variables) and Vectorization&lt;/h1&gt;
&lt;p&gt;In &quot;multiple features&quot;, Andrew crisply explained how to simplify the multiple-feature formula using vectors and the dot product.&lt;/p&gt;
&lt;p&gt;In &quot;Part 1&quot;, Andrew introduced how to use NumPy to compute the dot product and said GPUs are good at this type of calculation. NumPy functions can use parallel hardware to make the dot product fast.&lt;/p&gt;
&lt;p&gt;In &quot;Part 2&quot;, Andrew further explained why computers can compute dot products so fast, using gradient descent as an example.&lt;/p&gt;
&lt;p&gt;The labs were informative; I walked through all of them, though I already knew most of the content.&lt;/p&gt;
&lt;p&gt;More links about &lt;a href=&quot;https://www.geeksforgeeks.org/vectorization-techniques-in-nlp/&quot;&gt;vectorization can be found here&lt;/a&gt;.&lt;/p&gt;
&lt;h1&gt;Questions for helping myself learning&lt;/h1&gt;
&lt;p&gt;I created the following questions to test my knowledge later.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Why is there an arrow over the variable?&lt;br /&gt;
A: It&apos;s optional, but nice to have to indicate it&apos;s a vector, not a single number.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;What does the dot product do?&lt;br /&gt;
A: The dot product of two vectors (two lists of numbers) W and X is computed by multiplying corresponding pairs of numbers and summing the products.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;What is multiple linear regression?&lt;br /&gt;
A: It&apos;s the name for the type of linear regression model with multiple input features, in contrast to &quot;univariate regression&quot;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Is multivariate regression the same as &quot;multiple linear regression&quot;?&lt;br /&gt;
A: No. Multivariate regression is something else.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Why do we need vectorization?&lt;br /&gt;
A: Vectorized code can perform calculations in much less time than unvectorized code on specialized hardware. &lt;br /&gt;
This matters more when you&apos;re running algorithms on large data sets or trying to train large models, which is often the case in machine learning.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
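&lt;p&gt;The dot product answer above, in code - a small NumPy sketch of my own (the numbers are made up):&lt;/p&gt;

```python
import numpy as np

w = np.array([1.0, 2.5, -3.3])
x = np.array([10.0, 20.0, 30.0])
b = 4.0

# Unvectorized: multiply corresponding pairs of numbers, then sum
f = 0.0
for j in range(w.shape[0]):
    f = f + w[j] * x[j]
f = f + b

# Vectorized: NumPy computes the same dot product in one call,
# which can run on fast parallel hardware under the hood
f_vec = np.dot(w, x) + b
print(f, f_vec)   # both give the same prediction
```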
&lt;p&gt;&lt;img src=&quot;./AYbduL58LYRXAAAAAElFTkSuQmCC&quot; alt=&quot;vectorization and multiple feature&quot; title=&quot;vectorization and multiple feature&quot; /&gt;&lt;/p&gt;
&lt;p&gt;What is $x_1^{(4)}$ in the graph above?&lt;/p&gt;
&lt;p&gt;Ps. feel free to check out the series of my &lt;a href=&quot;/tags/machine-learning&quot;&gt;Supervised Machine Learning journey&lt;/a&gt;.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Supervised Machine Learning – Day 6</title><link>https://geekcoding101.com/posts/supervised-machine-learning-day-6-7</link><guid isPermaLink="true">https://geekcoding101.com/posts/supervised-machine-learning-day-6-7</guid><pubDate>Fri, 19 Apr 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Today I spent 10 30 &lt;strong&gt;60&lt;/strong&gt; mins reviewing previous notes, just realized that&apos;s a lot.&lt;/p&gt;
&lt;p&gt;I am amazing 🤩&lt;/p&gt;
&lt;p&gt;Today started with &quot;Gradient descent for multiple linear regression&quot;.&lt;/p&gt;
&lt;h1&gt;Gradient descent for multiple linear regression&lt;/h1&gt;
&lt;p&gt;Holy... At the beginning, Andrew threw out the formula below and said he hoped we still remembered it. I didn&apos;t:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-36-1024x299.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Why couldn&apos;t I recognize it... Where does this come from?&lt;/p&gt;
&lt;p&gt;I spent 30 minutes reviewing several previous videos, then found it... The important videos are:&lt;/p&gt;
&lt;p&gt;1. Week 1 &lt;a href=&quot;https://www.coursera.org/learn/machine-learning/lecture/TXDBu/implementing-gradient-descent&quot;&gt;Implementing gradient descent&lt;/a&gt;, where Andrew just wrote down the formula below without explaining it (he explained it later)&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-37.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;2. &lt;a href=&quot;https://www.coursera.org/learn/machine-learning/lecture/lgSMj/gradient-descent-for-linear-regression&quot;&gt;Gradient descent for linear regression&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-38-1024x504.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-40-1024x297.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Holy! Found a mistake in Andrew&apos;s course!&lt;br /&gt;
On the screenshot above, Andrew dropped the x(i) at the end of the first line!&lt;/p&gt;
&lt;p&gt;WOW! I ROCK! Spent almost 60mins!&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I am done for today!&lt;/p&gt;
&lt;h1&gt;The situation reversed an hour later&lt;/h1&gt;
&lt;p&gt;But I felt upset. I was NOT convinced that I had really found such a simple mistake, especially in Andrew&apos;s most popular machine learning course!&lt;/p&gt;
&lt;p&gt;I started trying to derive the formula myself.&lt;/p&gt;
&lt;p&gt;And.... I found out I was indeed too young too naive... Andrew was right...&lt;/p&gt;
&lt;p&gt;I got help from my college classmate who has been dealing with calculus everyday for more than 20 years...&lt;/p&gt;
&lt;p&gt;This is his derivation of the formula:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./derivation-process-1024x500.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;He said this to me like my math teacher in college:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Chain rule, deriving step by step.&lt;/p&gt;
&lt;/blockquote&gt;
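&lt;p&gt;To convince myself without redoing the calculus, here is a quick numerical check (my own sketch) that the analytic gradient from the chain rule - including that trailing x(i) - matches a finite-difference approximation of the cost:&lt;/p&gt;

```python
import numpy as np

def cost(x, y, w, b):
    """Squared error cost J(w, b) with the 1/(2m) convention."""
    return np.sum((w * x + b - y) ** 2) / (2 * x.shape[0])

def analytic_dj_dw(x, y, w, b):
    # Chain rule result: the inner derivative contributes the trailing x_i
    return np.sum((w * x + b - y) * x) / x.shape[0]

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.5, 5.5])
w, b, h = 0.7, 0.3, 1e-6

# Central finite difference: (J(w+h) - J(w-h)) / (2h)
numeric = (cost(x, y, w + h, b) - cost(x, y, w - h, b)) / (2 * h)
print(analytic_dj_dw(x, y, w, b), numeric)   # nearly identical
```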
&lt;p&gt;If you still remember, I have mentioned Parul Pandey’s &lt;a href=&quot;https://towardsdatascience.com/understanding-the-mathematics-behind-gradient-descent-dde5dc9be06e&quot;&gt;Understanding the Mathematics behind Gradient Descent&lt;/a&gt; in my previous post &lt;a href=&quot;/posts/supervised-machine-learning-day-2-3-on-my-way-to-becoming-a-machine-learning-person&quot;&gt;Supervised Machine Learning – Day 2 &amp;amp; 3 – On My Way To Becoming A Machine Learning Person&lt;/a&gt;, and in her post she did mention:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Primarily we shall be dealing with two concepts from calculus :&lt;/p&gt;
&lt;p&gt;Power rule and chain rule.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Well, she is right as well 😁&lt;/p&gt;
&lt;p&gt;So happy I learnt a lot today!&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Finished Machine Learning for Absolute Beginners - Level 1</title><link>https://geekcoding101.com/posts/finished-machine-learning-for-absolute-beginners-level-1</link><guid isPermaLink="true">https://geekcoding101.com/posts/finished-machine-learning-for-absolute-beginners-level-1</guid><pubDate>Thu, 25 Apr 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;./ml-absolute-beginner-level1-cert-1024x746.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;As you know I was in progress learning Andrew Ng&apos;s &lt;a href=&quot;https://www.coursera.org/learn/machine-learning&quot;&gt;Supervised Machine Learning: Regression and Classification&lt;/a&gt;, it&apos;s so dry!&lt;/p&gt;
&lt;p&gt;So I also spared some time to pick up some easy ML courses to help me understand.&lt;/p&gt;
&lt;p&gt;Today I came across &lt;a href=&quot;https://www.udemy.com/course/machine-learning-for-absolute-beginners-level-1&quot;&gt;Machine Learning for Absolute Beginners - Level 1&lt;/a&gt;, and it&apos;s really easy and beginner-friendly.&lt;/p&gt;
&lt;p&gt;Finished in 2.5 hours - maybe because I&apos;ve made some good progress in &lt;a href=&quot;https://www.coursera.org/learn/machine-learning&quot;&gt;Supervised Machine Learning: Regression and Classification&lt;/a&gt;, so it felt easy.&lt;/p&gt;
&lt;p&gt;I want to share my notes in this blog post.&lt;/p&gt;
&lt;h1&gt;&lt;strong&gt;Applied AI&lt;/strong&gt; or &lt;strong&gt;Shallow AI&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;Industrial robots can handle the specific, narrow tasks they have been programmed for; this is called &lt;strong&gt;Applied AI&lt;/strong&gt; or &lt;strong&gt;Shallow AI&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Under-fitting and over-fitting are challenges for Generalization.&lt;/p&gt;
&lt;h1&gt;Under-fitting&lt;/h1&gt;
&lt;p&gt;The trained model is not working well on the training data and can’t generalize to new data.&lt;/p&gt;
&lt;p&gt;Reasons may be:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;The model was too simple&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Data set is not good enough&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;An ideal training process would look like:&lt;/p&gt;
&lt;p&gt;Under-fitting… better fitting… good fit&lt;/p&gt;
&lt;h1&gt;Over-fitting&lt;/h1&gt;
&lt;p&gt;The trained model works well on the training data but can’t generalize well to new data.&lt;/p&gt;
&lt;p&gt;Reasons may be:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Training dataset is not a true distribution of the data (mitigation: use a much larger training dataset and a test dataset)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A model that is too complex (fit the data as simply as possible)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Small training dataset&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Training dataset (labeled) -&amp;gt; ML &lt;strong&gt;Training phase&lt;/strong&gt; -&amp;gt; Trained Model&lt;/p&gt;
&lt;p&gt;The input (unlabeled dataset) -&amp;gt; processed by Trained model (&lt;strong&gt;inference phase&lt;/strong&gt;) -&amp;gt; output (labeled dataset)&lt;/p&gt;
&lt;p&gt;Approaches or learning algorithms of ML systems can be categorized into:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Supervised Learning&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Unsupervised Learning&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Reinforcement Learning&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h1&gt;&lt;strong&gt;Supervised Learning&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;There are two very typical tasks that are performed using supervised learning:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;classification (this is not clustering which belongs to Unsupervised learning!)&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;One of the algorithm: Support Vector Machines (SVM)&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;regression -&amp;gt; statistical methods for estimating the strength of the relationship between a dependent variable and one or more independent variables.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Linear regression&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Logistic regression&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Polynomial regression&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h1&gt;Shallow Learning&lt;/h1&gt;
&lt;p&gt;One of the common classification algorithms under the shallow learning category is called &lt;strong&gt;Support Vector Machines (SVM)&lt;/strong&gt;.&lt;/p&gt;
&lt;h1&gt;&lt;strong&gt;Unsupervised Learning&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;The goal is to automatically identify meaningful patterns in unlabeled data.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Clustering&lt;/strong&gt; -&amp;gt; the task of identifying similar instances with shared attributes in a data set and grouping them together into clusters: grouping a set of objects so that objects in the same group are more similar to each other than to those in other groups.&lt;br /&gt;
The output of the algorithm will be &lt;strong&gt;a set of labels assigning each data point to one of the identified clusters.&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Customer segmentation, like demographic information&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Anomaly/Outlier detection&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Dimension reduction&lt;/strong&gt; - why we have this: one of the biggest challenges of supervised learning is having too many input features.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Large dataset -&amp;gt; pre-processing (Dimension Reduction) -&amp;gt; produce smaller dataset -&amp;gt; use the dataset for training Supervised learning model.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Image segmentation&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
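&lt;p&gt;As a toy illustration of the clustering idea above, here is a tiny k-means sketch in NumPy (my own code, not part of the course; the data and the naive initialization are made up):&lt;/p&gt;

```python
import numpy as np

def kmeans(points, k, iterations=10):
    """Tiny k-means: assigns each data point to one of k clusters."""
    centers = points[:k].copy()              # naive init: first k points
    for _ in range(iterations):
        # Assign each point to its nearest center
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Move each center to the mean of the points assigned to it
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels

# Two obvious groups of 2-D points
data = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                 [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])
labels = kmeans(data, k=2)
print(labels)   # first three points share one label, the last three the other
```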
&lt;h1&gt;&lt;strong&gt;Semi-supervised Learning&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;Sitting between supervised learning and unsupervised learning&lt;/p&gt;
&lt;p&gt;It works like this way:&lt;/p&gt;
&lt;p&gt;Unlabeled dataset -&amp;gt; clustering -&amp;gt; clusters 1, 2, 3… -&amp;gt; label the dataset -&amp;gt; the labeled dataset can be used for training a supervised learning model.&lt;/p&gt;
&lt;h1&gt;&lt;strong&gt;Reinforcement Learning&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;Completely different from the above (supervised and unsupervised).&lt;/p&gt;
&lt;p&gt;It’s not using a group of labeled or unlabeled examples.&lt;/p&gt;
&lt;p&gt;Used as a framework for decision-making tasks based on goals.&lt;/p&gt;
&lt;p&gt;It pursues a complex objective by performing multiple sequences of actions.&lt;/p&gt;
&lt;h2&gt;&lt;strong&gt;Usage&lt;/strong&gt;&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Chess game (all computer games typically) to achieve superhuman performance&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;The feedback on selected strategies is delayed!&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Try things and get feedback. In other words, based on the feedback or interaction, the model is learning.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Training robot to perform tasks in dynamic environment or &lt;strong&gt;building real time recommendations&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;RL is a method used to let machines learn how to behave based on interaction with the environment while focusing on some end goal.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;&lt;strong&gt;Decision Making Agent (Under RL)&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;Because there are some end goals, so there must be a reward mechanism to move forward to the right direction while in the learning process.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;RL builds a prediction model by gaining feedback from random trial and error and leveraging insight from previous interactions.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The cumulative knowledge of how to achieve a specific goal is reinforced again and again by experience.&lt;/p&gt;
&lt;h1&gt;Comments on the course&lt;/h1&gt;
&lt;p&gt;This course was indeed designed for &quot;Absolute Beginners&quot;.&lt;/p&gt;
&lt;p&gt;One suggestion is that the quizzes are too easy, as most simply reiterate the content explained in the course without introducing any variations or ambiguities to challenge the learner.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Master Feature Scaling &amp; Gradient Descent: Supervised Machine Learning – Day 7</title><link>https://geekcoding101.com/posts/master-feature-scaling-gradient-descent-supervised-machine-learning-day-7</link><guid isPermaLink="true">https://geekcoding101.com/posts/master-feature-scaling-gradient-descent-supervised-machine-learning-day-7</guid><pubDate>Thu, 25 Apr 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Welcome back&lt;/h1&gt;
&lt;p&gt;I didn&apos;t get much time working on the course in past 5 days!!!&lt;/p&gt;
&lt;p&gt;Finally resuming today!&lt;/p&gt;
&lt;p&gt;Today I reviewed &lt;a href=&quot;https://www.coursera.org/learn/machine-learning/lecture/KMDV3/feature-scaling-part-1&quot;&gt;Feature scaling part 1&lt;/a&gt; and learned &lt;a href=&quot;https://www.coursera.org/learn/machine-learning/lecture/akapu/feature-scaling-part-2&quot;&gt;Feature scaling part 2&lt;/a&gt; and &lt;a href=&quot;https://www.coursera.org/learn/machine-learning/lecture/rOTkB/checking-gradient-descent-for-convergence&quot;&gt;Checking gradient descent for convergence&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The course is getting harder: for a 20-minute video, I spent double the time and needed to check external articles to get a better understanding.&lt;/p&gt;
&lt;h1&gt;Feature Scaling&lt;/h1&gt;
&lt;p&gt;Trying to understand what &quot;Feature Scaling&quot; is...&lt;/p&gt;
&lt;p&gt;What are features and parameters in below formula?&lt;/p&gt;
&lt;p&gt;$\hat{\text{Price}} = w_1x_1 + w_2x_2 + b$&lt;/p&gt;
&lt;p&gt;x1 and x2 are features: the former represents the size of the house, the latter the number of bedrooms.&lt;/p&gt;
&lt;p&gt;w1 and w2 are parameters.&lt;/p&gt;
&lt;p&gt;When a possible range of values of a feature is large, it&apos;s more likely that a good model will learn to choose a relatively small parameter value.&lt;/p&gt;
&lt;p&gt;Likewise, when the possible values of the feature are small, like the number of bedrooms, then a reasonable value for its parameters will be relatively large like 50.&lt;/p&gt;
&lt;p&gt;So how does this relate to gradient descent?&lt;/p&gt;
&lt;p&gt;At the end of this video, Andrew explained that the features need to be re-scaled or transformed so that the cost function J on the transformed data has a better shape, and gradient descent can find a much more direct path to the global minimum.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-41-1024x478.png&quot; alt=&quot;Feature Scaling &amp;amp; Gradient Descent&quot; title=&quot;Feature Scaling &amp;amp; Gradient Descent&quot; /&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;When you have different features that take on very different ranges of values, it can cause gradient descent to run slowly, but rescaling the different features so they all take on comparable ranges of values can speed up gradient descent significantly.&lt;/p&gt;
&lt;p&gt;Andrew Ng&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;One key aspect of feature engineering is scaling, normalization, and standardization, which involves transforming the data to make it more suitable for modeling. These techniques can help to improve model performance, reduce the impact of outliers, and ensure that the data is on the same scale.&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://www.coursera.org/learn/machine-learning/lecture/akapu/feature-scaling-part-2&quot;&gt;Feature scaling part 2&lt;/a&gt; video mentioned why we need to scale. I did some Google searching and found that &lt;a href=&quot;https://www.analyticsvidhya.com/blog/2020/04/feature-scaling-machine-learning-normalization-standardization/&quot;&gt;Feature Scaling: Engineering, Normalization, and Standardization (Updated 2024)&lt;/a&gt; is really good.&lt;/p&gt;
&lt;p&gt;As a summary of the video, we know a few methods to do scaling:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Divide by the maximum value of the feature.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Mean Normalization&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Z-score normalization&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
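&lt;p&gt;The three scaling methods above, sketched in NumPy (my own code; the formulas follow the lecture, the toy numbers are made up):&lt;/p&gt;

```python
import numpy as np

x = np.array([300.0, 1000.0, 2000.0, 5000.0])   # e.g. house sizes

# 1. Divide by the maximum value of the feature
x_max_scaled = x / np.max(x)

# 2. Mean normalization: (x - mean) / (max - min)
x_mean_norm = (x - np.mean(x)) / (np.max(x) - np.min(x))

# 3. Z-score normalization: (x - mean) / standard deviation
x_zscore = (x - np.mean(x)) / np.std(x)

print(x_max_scaled)   # all values now in (0, 1]
print(x_mean_norm)    # centered around 0
print(x_zscore)       # mean 0, standard deviation 1
```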
&lt;blockquote&gt;
&lt;p&gt;As a rule of thumb, when performing feature scaling, you might want to aim for getting the features to range from maybe anywhere around negative one to somewhere around plus one for each feature x. &lt;br /&gt;
But these values, negative one and plus one can be a little bit loose. &lt;br /&gt;
If the features range from negative three to plus three or negative 0.3 to plus 0.3, all of these are completely okay. &lt;br /&gt;
If you have a feature x_1 that winds up being between zero and three, that&apos;s not a problem. &lt;br /&gt;
You can re-scale it if you want, but if you don&apos;t re-scale it, it should work okay too.&lt;/p&gt;
&lt;p&gt;Andrew Ng&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If the range is too large or too small, you should rescale.&lt;/p&gt;
&lt;h1&gt;Checking gradient descent for convergence&lt;/h1&gt;
&lt;p&gt;I learnt to use the learning curve of the cost function J to see whether gradient descent is converging.&lt;/p&gt;
&lt;p&gt;If the graph of J ever increases after an iteration, that means either Alpha was chosen poorly (usually too large), or there could be a bug in the code.&lt;/p&gt;
&lt;p&gt;J of (vector w, b) should decrease as the number of iterations increases.&lt;/p&gt;
&lt;p&gt;If the curve has flattened out, this means that gradient descent has more or less converged because the curve is no longer decreasing.&lt;/p&gt;
&lt;p&gt;Andrew said he usually found choosing the right threshold epsilon pretty difficult, so he tended to look at the learning curve rather than rely on automatic convergence tests.&lt;/p&gt;
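&lt;p&gt;The automatic convergence test Andrew described could be sketched like this (my own code; using math.isclose to compare the last two costs, and the epsilon value is an arbitrary assumption):&lt;/p&gt;

```python
import math

def converged(cost_history, epsilon=1e-3):
    """Convergence test: the last two recorded costs differ by less than
    epsilon, i.e. the learning curve has flattened out.  Assumes the cost
    is non-increasing, as it should be with a well-chosen learning rate."""
    if len(cost_history) in (0, 1):
        return False
    return math.isclose(cost_history[-1], cost_history[-2], abs_tol=epsilon)

history = [10.0, 4.0, 2.0, 1.5, 1.4995]
print(converged(history[:3]))   # False: J is still dropping fast
print(converged(history))       # True: the last drop was only 0.0005
```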
&lt;h1&gt;References&lt;/h1&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href=&quot;https://www.analyticsvidhya.com/blog/2020/04/feature-scaling-machine-learning-normalization-standardization/&quot;&gt;Feature Scaling: Engineering, Normalization, and Standardization (Updated 2024)&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h1&gt;Terminology&lt;/h1&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Term&lt;/th&gt;
&lt;th&gt;Comments&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Normalization&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mean Normalization&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Z-score normalization&lt;/td&gt;
&lt;td&gt;Involved to calculate &quot;standard deviation σ&quot;.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;standard deviation σ&lt;/td&gt;
&lt;td&gt;It is &lt;strong&gt;a measure of how dispersed the data is in relation to the mean&lt;/strong&gt;. Low, or small, standard deviation indicates data are clustered tightly around the mean, and high, or large, standard deviation indicates data are more spread out.   Or if you&apos;ve heard of the normal distribution or the bell-shaped curve, sometimes also called the Gaussian distribution, this is what the standard deviation for the normal distribution looks like.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Learning curve&lt;/td&gt;
&lt;td&gt;It is difficult to tell in advance how many iterations gradient descent needs to converge, which is why you can create a learning curve.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Automatic Convergence Test&lt;/td&gt;
&lt;td&gt;Another way to decide when your model is done training is with an automatic convergence test.   If the cost J decreases by less than this number epsilon on one iteration, then you&apos;re likely on this flattened part of the curve that you see on the left and you can declare convergence.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
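&lt;p&gt;A minimal NumPy sketch of two ideas from the table above, z-score normalization and the automatic convergence test (the epsilon value is illustrative, not from the course):&lt;/p&gt;

```python
import numpy as np

def zscore_normalize(X):
    """Z-score normalization: subtract the mean, divide by the standard deviation σ."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

def converged(cost_history, epsilon=1e-3):
    """Automatic convergence test: stop when cost J drops by less than epsilon."""
    if len(cost_history) >= 2:
        return epsilon > cost_history[-2] - cost_history[-1]
    return False
```

&lt;p&gt;After normalization, every feature has mean 0 and standard deviation 1, which is what makes gradient descent converge faster on features of very different scales.&lt;/p&gt;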
</content:encoded><author>GeekCoding101</author></item><item><title>Master Learning Rate and Feature Engineering: Supervised Machine Learning – Day 8</title><link>https://geekcoding101.com/posts/master-learning-rate-and-feature-engineering-supervised-machine-learning-day-8</link><guid isPermaLink="true">https://geekcoding101.com/posts/master-learning-rate-and-feature-engineering-supervised-machine-learning-day-8</guid><pubDate>Sat, 27 Apr 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Today I started with &lt;a href=&quot;https://www.coursera.org/learn/machine-learning/lecture/10ZVv/choosing-the-learning-rate&quot;&gt;Choosing the learning rate&lt;/a&gt;, reviewed the Jupyter lab and learnt what is &lt;a href=&quot;https://www.coursera.org/learn/machine-learning/lecture/dgZYR/feature-engineering&quot;&gt;feature engineering&lt;/a&gt;.&lt;/p&gt;
&lt;h1&gt;Choosing the learning rate&lt;/h1&gt;
&lt;p&gt;The graph taught in &lt;a href=&quot;https://www.coursera.org/learn/machine-learning/lecture/10ZVv/choosing-the-learning-rate&quot;&gt;Choosing the learning rate&lt;/a&gt; is helpful when developing models:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-42-1024x508.png&quot; alt=&quot;feature engineering and gradient descent&quot; title=&quot;feature engineering and gradient descent&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;Feature Engineering&lt;/h2&gt;
&lt;p&gt;When I first started Andrew Ng’s Supervised Machine Learning course, I didn’t really realize how much of an impact feature engineering could have on a model’s performance. But boy, was I in for a surprise! As I worked through the course, I quickly realized that the raw data we start with is rarely good enough for building a great model. Instead, it needs to be transformed, scaled, and cleaned up — that’s where feature engineering comes into play.&lt;/p&gt;
&lt;p&gt;Feature engineering is all about making your data more useful for a machine learning algorithm. Think of it like preparing ingredients for a recipe — the better the quality of your ingredients, the better the final dish will be. Similarly, in machine learning, the features (the input variables) need to be well-prepared to help the algorithm understand patterns more easily. Without this step, even the most powerful algorithms might not perform at their best.&lt;/p&gt;
&lt;p&gt;In the course, Andrew Ng really breaks it down and explains how important feature scaling and transformation are. In one of the early lessons, he used the example of linear regression — a simple algorithm that relies on understanding the relationship between input features and the output. If the features are on vastly different scales, it can throw off the whole process and make training the model take much longer. This was something I had never considered before, but it made so much sense once he explained it.&lt;/p&gt;
&lt;p&gt;I remember struggling a bit with the idea of scaling features at first. Some of the variables in the data might be on completely different scales, like the size of a house in square feet versus the number of bedrooms. Features like the size of a house might have values in the thousands, while the number of bedrooms might only range from 1 to 5. Without scaling, the larger feature would dominate the learning process, and the model would be biased toward it. Learning how to scale these features properly made a huge difference in getting the algorithm to work more efficiently.&lt;/p&gt;
&lt;p&gt;One of the key takeaways from this part of the course was the importance of understanding your data. Andrew Ng emphasizes that feature engineering is not just about applying transformations, but also about using your knowledge of the problem to make the data more relevant. For example, I learned that I could create new features from existing ones. If I had data about the size of a house and the number of bedrooms, I could create a new feature like &quot;bedrooms per square foot,&quot; which might give the model more useful information to work with.&lt;/p&gt;
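&lt;p&gt;The scaling and derived-feature ideas above can be sketched in a few lines. The housing numbers and the &lt;code&gt;scale&lt;/code&gt; helper here are illustrative, not from the course labs:&lt;/p&gt;

```python
import numpy as np

# Hypothetical housing data: size in square feet, number of bedrooms.
size_sqft = np.array([2104.0, 1416.0, 852.0])
bedrooms = np.array([5.0, 3.0, 2.0])

# A derived feature, as mentioned above: bedrooms per square foot.
bedrooms_per_sqft = bedrooms / size_sqft

# Scale each feature so the large one (size) doesn't dominate training.
def scale(x):
    return (x - x.mean()) / x.std()

X = np.column_stack([scale(size_sqft), scale(bedrooms), scale(bedrooms_per_sqft)])
```

&lt;p&gt;After this step every column of &lt;code&gt;X&lt;/code&gt; is on a comparable scale, so gradient descent treats house size and bedroom count evenly.&lt;/p&gt;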
&lt;p&gt;As the course progressed, I saw how important feature engineering is not just for scaling but also for improving model performance in more complex algorithms like logistic regression and neural networks. It was all about creating features that made it easier for the algorithm to find patterns and make predictions.&lt;/p&gt;
&lt;p&gt;To be honest, at first, I didn’t realize how much of a difference feature engineering could make. But after practicing it through the course, I saw firsthand how tweaking and transforming features could lead to much better results. It felt like I had unlocked a new level in the course, where I wasn’t just feeding data into the algorithm, but actually working with it to make the algorithm smarter. Feature engineering really turned out to be one of the most rewarding parts of learning machine learning, and it’s something I’ll definitely keep honing as I continue on this journey.&lt;/p&gt;
&lt;h1&gt;It&apos;s the end of this learning week?!&lt;/h1&gt;
&lt;p&gt;Didn&apos;t expect that I reached the end of this learning week!&lt;/p&gt;
&lt;p&gt;I didn&apos;t spend much time on the labs. There are three labs in this learning week, and I&apos;m afraid I might not be ready for this week&apos;s practice lab.&lt;/p&gt;
&lt;p&gt;Will come back to update more details after my exercises.&lt;/p&gt;
&lt;p&gt;Ps. feel free to check out &lt;a href=&quot;https://geekcoding101.com/tags/machine-learning&quot;&gt;my other posts in Supervised Machine Learning Journey&lt;/a&gt;.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Master Gradient Descent and Binary Classification: Supervised Machine Learning – Day 9</title><link>https://geekcoding101.com/posts/master-gradient-descent-and-binary-classification-supervised-machine-learning-day-9</link><guid isPermaLink="true">https://geekcoding101.com/posts/master-gradient-descent-and-binary-classification-supervised-machine-learning-day-9</guid><pubDate>Thu, 09 May 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;A break due to sickness&lt;/h1&gt;
&lt;p&gt;Oh boy... I was sick for almost two weeks 🤒 After a brief break, I’m back to dive deep into machine learning, and today, we’ll revisit one of the core concepts in training models—&lt;a href=&quot;https://en.wikipedia.org/wiki/Gradient_descent&quot;&gt;&lt;strong&gt;gradient descent&lt;/strong&gt;&lt;/a&gt;. This optimization technique is essential for minimizing the cost function and finding the optimal parameters for our machine learning models. Whether you&apos;re working with linear regression or more complex algorithms, understanding how gradient descent guides the learning process is key to achieving accurate predictions and efficient model training.&lt;/p&gt;
&lt;p&gt;Let&apos;s dive back into the data-drenched depths where we left off, shall we? 🚀&lt;/p&gt;
&lt;h1&gt;The first coding assessment&lt;/h1&gt;
&lt;p&gt;I couldn&apos;t recall all of the material, actually. The assessment tests the implementation of gradient descent for one-variable linear regression.&lt;/p&gt;
&lt;p&gt;I walked through the previous lessons and found this summary really helpful:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-1024x760.png&quot; alt=&quot;gradient descent&quot; title=&quot;gradient descent&quot; /&gt;&lt;/p&gt;
&lt;p&gt;This exercise reinforced what I&apos;ve learnt about &quot;gradient descent&quot; this week.&lt;/p&gt;
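&lt;p&gt;For my own reference, here is a minimal sketch of what the assessment asks for: batch gradient descent for one-variable linear regression with a squared-error cost. The learning rate and iteration count are illustrative defaults, not the assignment&apos;s:&lt;/p&gt;

```python
import numpy as np

def gradient_descent(x, y, alpha=0.01, num_iters=1000):
    """Batch gradient descent for f(x) = w*x + b, minimizing squared-error cost."""
    w, b = 0.0, 0.0
    m = len(x)
    for _ in range(num_iters):
        f = w * x + b                      # current predictions
        dj_dw = ((f - y) * x).sum() / m    # partial derivative of J w.r.t. w
        dj_db = (f - y).sum() / m          # partial derivative of J w.r.t. b
        w -= alpha * dj_dw                 # simultaneous update of w and b
        b -= alpha * dj_db
    return w, b
```

&lt;p&gt;Running this on data generated from a known line recovers the original slope and intercept, which is a quick sanity check for the implementation.&lt;/p&gt;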
&lt;h1&gt;Getting into Classification&lt;/h1&gt;
&lt;p&gt;I started the third week of the course. Looks like it will be more interesting.&lt;/p&gt;
&lt;p&gt;I made a few notes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;binary classification&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The negative class doesn&apos;t mean &quot;bad&quot;; it means absence.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The positive class doesn&apos;t mean &quot;good&quot;; it means presence.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;New English words: benign, malignant&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Logistic regression: even though it has &quot;regression&quot; in the name, it&apos;s used for classification.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;threshold&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;sigmoid function or logistic function&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;decision boundary&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;img src=&quot;./image-1-1024x521.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-2.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Probability that y is 1,&lt;br /&gt;
given input vector x and parameters vector w, b.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I couldn&apos;t focus too long on this. Need to pause after watching a few videos.&lt;/p&gt;
&lt;p&gt;Bye now.&lt;/p&gt;
&lt;p&gt;Ps. feel free to check out &lt;a href=&quot;https://geekcoding101.com/tags/machine-learning&quot;&gt;my other posts in Supervised Machine Learning Journey&lt;/a&gt;.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Grinding Through Logistic regression: Exploring Supervised Machine Learning – Day 10</title><link>https://geekcoding101.com/posts/grinding-through-logistic-regression-exploring-supervised-machine-learning-day-10</link><guid isPermaLink="true">https://geekcoding101.com/posts/grinding-through-logistic-regression-exploring-supervised-machine-learning-day-10</guid><pubDate>Sat, 11 May 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Let&apos;s continue!&lt;/h1&gt;
&lt;p&gt;Today is mainly learning about &quot;Decision boundary&quot;, &quot;Cost function of logistic regression&quot;, &quot;Logistic loss&quot; and &quot;Gradient Descent Implementation for logistic regression&quot;.&lt;/p&gt;
&lt;p&gt;We found out the &quot;decision boundary&quot; is where z equals 0 in the sigmoid function.&lt;/p&gt;
&lt;p&gt;At that point the sigmoid outputs 0.5, its neutral position.&lt;/p&gt;
&lt;p&gt;Andrew gave an example with two variables, z = x1 + x2 - 3 (w1 = w2 = 1); the decision boundary is the line x1 + x2 = 3.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-3-1024x524.png&quot; alt=&quot;decision boundary formula and graph&quot; title=&quot;decision boundary formula and graph&quot; /&gt;&lt;/p&gt;
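&lt;p&gt;Andrew&apos;s example can be sketched in a few lines (the 0.5 threshold is the usual default for logistic regression):&lt;/p&gt;

```python
import numpy as np

def sigmoid(z):
    """The sigmoid (logistic) function, mapping any z to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Andrew's example: w1 = w2 = 1, b = -3, so z = x1 + x2 - 3.
def predict(x1, x2, threshold=0.5):
    z = x1 + x2 - 3.0
    return 1 if sigmoid(z) >= threshold else 0

# On the decision boundary x1 + x2 = 3 we get z = 0, and sigmoid(0) = 0.5:
# points above the line predict 1, points below predict 0.
```

&lt;p&gt;For example, &lt;code&gt;predict(2, 2)&lt;/code&gt; returns 1 (above the line) while &lt;code&gt;predict(1, 1)&lt;/code&gt; returns 0 (below it).&lt;/p&gt;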
&lt;h2&gt;I want to say &quot;&lt;a href=&quot;https://www.coursera.org/learn/machine-learning/lecture/0hpr8/cost-function-for-logistic-regression#&quot;&gt;Cost function for logistic regression&lt;/a&gt;&quot; is the hardest part of week 3 I&apos;ve seen so far.&lt;/h2&gt;
&lt;p&gt;I haven&apos;t quite figured out why the squared error cost function isn&apos;t applicable and where the loss function comes from.&lt;/p&gt;
&lt;p&gt;I have to re-watch the videos again.&lt;/p&gt;
&lt;p&gt;The lab is also useful.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-4-1024x495.png&quot; alt=&quot;Logistic regression graph&quot; title=&quot;Logistic regression graph&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-5-1024x310.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-13-1024x472.png&quot; alt=&quot;simplified cost function&quot; title=&quot;simplified cost function&quot; /&gt;&lt;/p&gt;
&lt;p&gt;This particular cost function is derived from statistics using a principle called &lt;a href=&quot;https://en.wikipedia.org/wiki/Maximum_likelihood_estimation&quot;&gt;&lt;strong&gt;maximum likelihood estimation (MLE)&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;h1&gt;Questions and Answers&lt;/h1&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Why do we need a loss function?&lt;br /&gt;
A: Logistic regression requires a cost function more suitable to its non-linear nature. This starts with a loss function.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Why is the square error cost function not applicable to logistic regression?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;What is maximum likelihood?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In logistic regression, &quot;cost&quot; and &quot;loss&quot; have distinct meanings. Which one applies to a single training example?&lt;br /&gt;
A: The term &quot;loss&quot; typically refers to the measure applied to a single training example, while &quot;cost&quot; refers to the average of the loss across the entire dataset or a batch of training examples.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
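&lt;p&gt;The loss/cost distinction from question 4 can be sketched directly: the loss applies to one training example, and the cost averages the loss over all examples (a sketch of the simplified formulas above, not the lab&apos;s exact code):&lt;/p&gt;

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_loss(f, y):
    """Loss for a single example: -y*log(f) - (1-y)*log(1-f)."""
    return -y * np.log(f) - (1.0 - y) * np.log(1.0 - f)

def cost(X, y, w, b):
    """Cost: the average of the loss over all m training examples."""
    f = sigmoid(X @ w + b)
    return logistic_loss(f, y).mean()
```

&lt;p&gt;A handy sanity check: with w = 0 and b = 0 the model predicts 0.5 for every example, so the cost is log(2) regardless of the labels.&lt;/p&gt;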
&lt;h1&gt;Some thoughts of today&lt;/h1&gt;
&lt;p&gt;Honestly, it feels like it&apos;s getting tougher and tougher.&lt;/p&gt;
&lt;p&gt;I can still get through the equations and derivations alright, it’s just that as I age, I feel like my brain is just not keeping up.&lt;/p&gt;
&lt;p&gt;At the end of each video, Andrew always congratulates me with a big smile, saying I’ve mastered the content of the session.&lt;/p&gt;
&lt;p&gt;But deep down, I really think what he&apos;s actually thinking is, &quot;Ha, got you stumped again!&quot;&lt;/p&gt;
&lt;p&gt;However, to be fair, Andrew really does explain things superbly well.&lt;/p&gt;
&lt;p&gt;I hope someday I can truly master this knowledge and use it effortlessly.&lt;/p&gt;
&lt;p&gt;Fighting!&lt;/p&gt;
&lt;p&gt;Ps. Feel free to check out my other &lt;a href=&quot;https://geekcoding101.com/tags/machine-learning&quot;&gt;AI Machine Learning Journal blog posts at here&lt;/a&gt;.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Overfitting! Unlocking the Last Key Concept in Supervised Machine Learning – Day 11, 12</title><link>https://geekcoding101.com/posts/overfitting-unlocking-the-last-key-concept-in-supervised-machine-learning-day-11-12</link><guid isPermaLink="true">https://geekcoding101.com/posts/overfitting-unlocking-the-last-key-concept-in-supervised-machine-learning-day-11-12</guid><pubDate>Mon, 13 May 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;./minion-woohoo.gif&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;I finished the course!&lt;/p&gt;
&lt;p&gt;I really enjoyed the learning experiences in Andrew&apos;s course so far.&lt;/p&gt;
&lt;p&gt;Let&apos;s see what I&apos;ve learned over these two days!&lt;/p&gt;
&lt;h1&gt;Overfitting - The Last Topic of this Course!&lt;/h1&gt;
&lt;p&gt;&lt;img src=&quot;./data-fitting-1024x420.webp&quot; alt=&quot;overfitting&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;&lt;a href=&quot;https://developers.google.com/machine-learning/crash-course/overfitting/overfitting&quot;&gt;&lt;strong&gt;Overfitting&lt;/strong&gt;&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;It occurs when a machine learning model learns the details and noise in the training data to an extent that it negatively impacts the performance of the model on new data. This means the model is great at predicting or fitting the training data but performs poorly on unseen data, due to its inability to generalize from the training set to the broader population of data.&lt;/p&gt;
&lt;p&gt;The course explains that overfitting can be addressed by:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Reducing model complexity&lt;/strong&gt;: Simplifying the model by selecting one with fewer parameters.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Regularization&lt;/strong&gt;: Adding a regularization term to the loss function, which penalizes large coefficients.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Using more training data&lt;/strong&gt;: More data can help the model learn more generalizable patterns.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
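&lt;p&gt;The regularization remedy above can be sketched by adding an L2 penalty to a squared-error cost. The scaling convention follows the course&apos;s usual form, but treat this as an illustrative sketch rather than the lab&apos;s exact code:&lt;/p&gt;

```python
import numpy as np

def regularized_cost(X, y, w, b, lam=1.0):
    """Squared-error cost plus an L2 penalty that shrinks the weights w.

    The penalty term (lam / (2m)) * sum(w_j**2) discourages large parameters;
    b is conventionally left unregularized.
    """
    m = X.shape[0]
    err = X @ w + b - y
    base = (err ** 2).sum() / (2 * m)
    penalty = lam * (w ** 2).sum() / (2 * m)
    return base + penalty
```

&lt;p&gt;With lam = 0 this reduces to the plain squared-error cost; a larger lam pushes the optimizer toward smaller weights, trading some training fit for better generalization.&lt;/p&gt;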
&lt;p&gt;We can&apos;t skip &lt;strong&gt;underfitting&lt;/strong&gt; either.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Overfitting and underfitting&lt;/strong&gt; both are undesirable effects that suggest a model is not well-tuned to the task at hand, but they stem from opposite causes and have different solutions.&lt;/p&gt;
&lt;p&gt;Below two screenshots captured from course for my notes:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-15-1024x498.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-14-1024x514.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;h1&gt;Questions help me to master the content&lt;/h1&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;What is overfitting (aka. high variance)? How to address it?&lt;br /&gt;
Solutions:&lt;br /&gt;
a. Get more training data.&lt;br /&gt;
b. Select features to include/exclude (more features plus insufficient data will cause overfitting). However, useful features could be lost.&lt;br /&gt;
c. Regularization: reduce the size of the parameters. Encourage the learning algorithm to shrink the parameter values without necessarily demanding that any parameter be set to exactly 0.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;What is underfitting?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;What is λ? How would it impact the learning algorithm if you choose a very large or very small value?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In practice, does regularizing b make much difference or not?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Give an example explaining what is preconception (aka. bias or underfit)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;What is generalization?&lt;br /&gt;
You want your learning algorithm to generalize well, which means making good predictions even on brand new examples it has never seen before.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;(The hardest one) Write down all the formulas taught in videos and explain how they could be implemented in python!&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h1&gt;Words From Andrew At The End!&lt;/h1&gt;
&lt;blockquote&gt;
&lt;p&gt;I want to say congratulations on how far you&apos;ve come and I want to say great job for getting through all the way to the end of this video.&lt;/p&gt;
&lt;p&gt;I hope you also work through the practice labs and quizzes.&lt;/p&gt;
&lt;p&gt;Having said that, there are still many more exciting things to learn.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Awesome!&lt;/p&gt;
&lt;p&gt;I am ready for my next machine learning journey!&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Install Azure-Cli on Mac</title><link>https://geekcoding101.com/posts/install-azure-cli-on-mac</link><guid isPermaLink="true">https://geekcoding101.com/posts/install-azure-cli-on-mac</guid><pubDate>Wed, 12 Jun 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Introduction&lt;/h1&gt;
&lt;p&gt;Are you ready to delve into the exciting realm of Azure AI?&lt;/p&gt;
&lt;p&gt;Whether you&apos;re a seasoned developer or just starting your journey in the world of artificial intelligence, Microsoft Build offers a transformative opportunity to harness the power of AI.&lt;/p&gt;
&lt;p&gt;Recently I came across several good tutorials on Microsoft website, e.g. &quot;&lt;a href=&quot;https://learn.microsoft.com/en-us/training/challenges?id=d1db6d81-f56e-4032-8779-b00a75aa762f&amp;amp;WT.mc_id=cloudskillschallenge_d1db6d81-f56e-4032-8779-b00a75aa762f&amp;amp;ocid=build24_csc_learnpromo_T1_cnl&quot;&gt;CLOUD SKILLS CHALLENGE: Microsoft Build: Build multimodal Generative AI experiences&lt;/a&gt;&quot;.&lt;/p&gt;
&lt;p&gt;I enjoyed learning from it. But I found that the very first step may seem like a challenge to many people: getting the az command to work on a Mac!&lt;/p&gt;
&lt;p&gt;So I decided to write down all my fixes.&lt;/p&gt;
&lt;p&gt;Let&apos;s go!&lt;/p&gt;
&lt;h1&gt;Resolution&lt;/h1&gt;
&lt;p&gt;I am following &quot;&lt;a href=&quot;https://learn.microsoft.com/en-us/cli/azure/install-azure-cli-macos&quot;&gt;Install Azure CLI on Mac&lt;/a&gt;&quot;.&lt;/p&gt;
&lt;p&gt;Run command:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;brew update &amp;amp;&amp;amp; brew install azure-cli
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But it failed with permission issue on openssl package:&lt;/p&gt;
&lt;figure&gt;
&lt;img src=&quot;./image-1024x373.png&quot; alt=&quot;Figure 1: install Azure Cli HIT Openssl Permission Issue&quot; title=&quot;Figure 1: install Azure Cli HIT Openssl Permission Issue&quot; /&gt;
&lt;figcaption&gt;Figure 1: OpenSSL Permission Issue&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;I fixed it by changing the permission of &lt;code&gt;/usr/local/lib&lt;/code&gt; to current user, but it&apos;s not enough.&lt;/p&gt;
&lt;p&gt;I hit Python permission issue at a different location:&lt;/p&gt;
&lt;figure&gt;
&lt;img src=&quot;./image-1-1024x192.png&quot; alt=&quot;Figure 2: Python Permission Issue&quot; /&gt;
&lt;figcaption&gt;Figure 2: Python Permission Issue&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;So I had to apply the permission change to &lt;code&gt;/usr/local&lt;/code&gt;. The commands are:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;brew unlink python@3.11
sudo chown -R &quot;$USER&quot;:admin /usr/local/
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The screenshot of &lt;code&gt;brew unlink&lt;/code&gt; command:&lt;/p&gt;
&lt;figure&gt;
&lt;img src=&quot;./image-2-1024x69.png&quot; alt=&quot;Figure 3: Brew Unlink Existing Python&quot; /&gt;
&lt;figcaption&gt;Figure 3: Brew Unlink Existing Python&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Finally it finished installation successfully!&lt;/p&gt;
&lt;figure&gt;
&lt;img src=&quot;./image-3-1024x103.png&quot; alt=&quot;Figure 4: azure-cli Installation Success&quot; /&gt;
&lt;figcaption&gt;Figure 4: Installation Success&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;Well done!&lt;/p&gt;
&lt;p&gt;Ps. You&apos;re welcome to access my other &lt;a href=&quot;https://geekcoding101.com/tags/daily-ai-insights&quot;&gt;AI Insights blog posts here&lt;/a&gt;.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Honored to Pass AI-102!</title><link>https://geekcoding101.com/posts/honored-to-pass-ai-102</link><guid isPermaLink="true">https://geekcoding101.com/posts/honored-to-pass-ai-102</guid><pubDate>Thu, 27 Jun 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;So, I did a thing. I earned my first Microsoft certificate: &lt;strong&gt;Azure AI Engineer Associate&lt;/strong&gt;! 🎉&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./az_102_cert-1024x721.png&quot; alt=&quot;AI-102 certificate&quot; title=&quot;AI-102 certificate&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Here is my story from training to passing the AI-102 exam.&lt;/p&gt;
&lt;h1&gt;The Learning Journey of AI-102&lt;/h1&gt;
&lt;p&gt;The journey began with a four-and-a-half-day company-provided AI-102 training session.&lt;/p&gt;
&lt;p&gt;It was a mix of online classes and labs. I was actually on vacation during this period, so I only managed to focus for about three days, probably.&lt;/p&gt;
&lt;p&gt;The labs provided during the training were very useful.&lt;/p&gt;
&lt;p&gt;There were about 10 labs, each lab could be done up to 10 times, with each session lasting 1 to 3 hours.&lt;/p&gt;
&lt;p&gt;So I didn’t need to pay Microsoft to get familiar with the Azure AI environment.&lt;/p&gt;
&lt;p&gt;By rough calculation, the training provided 100 to 200 hours of lab time, but I only used about 20 hours before taking the exam.&lt;/p&gt;
&lt;p&gt;After the AI-102 training, I mainly stuck to &lt;a href=&quot;https://learn.microsoft.com/en-us/training/courses/ai-102t00&quot;&gt;Microsoft Learn: Designing and Implementing a Microsoft Azure AI Solution&lt;/a&gt; to fill gaps.&lt;/p&gt;
&lt;p&gt;Trust me, that&apos;s really helpful! The MS Learn modules helped me understand the concepts better.&lt;/p&gt;
&lt;h2&gt;Cramming and Building Knowledge for AI-102&lt;/h2&gt;
&lt;p&gt;As the exam date got closer, I quickly skimmed &lt;a href=&quot;https://www.youtube.com/watch?v=I7fdWafTcPY&amp;amp;t=71s&quot;&gt;John Savill&apos;s Technical Training videos&lt;/a&gt; on YouTube once.&lt;/p&gt;
&lt;p&gt;His videos helped me build a complete knowledge framework in my head. One time is enough for me.&lt;/p&gt;
&lt;p&gt;Last but not least, please do read &lt;a href=&quot;https://areebpasha.notion.site/AI-102-Notes-dd32c9f349bb4e64a0d26ea661ba789c&quot;&gt;Areeb Pasha&apos;s AI-102 notes on Notion&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;Thanks to &lt;a href=&quot;https://www.reddit.com/user/_areebpasha/&quot;&gt;Areeb Pasha&lt;/a&gt;! It&apos;s so useful.&lt;/p&gt;
&lt;p&gt;These notes were like a concise version of MS Learn and made my studying very efficient.&lt;/p&gt;
&lt;p&gt;I managed to cover all the important points quickly.&lt;/p&gt;
&lt;h1&gt;The Exam Day Experience&lt;/h1&gt;
&lt;p&gt;The exam day finally arrived: June 26, 2024!&lt;/p&gt;
&lt;p&gt;My exam was scheduled at 8:15 AM, so I logged into OneVue 30 minutes early as advised.&lt;/p&gt;
&lt;p&gt;It turned out to be a good idea because there were many things to do before starting the exam (in my case tho).&lt;/p&gt;
&lt;p&gt;First, I had to tidy up my room...&lt;/p&gt;
&lt;p&gt;Then, there was an ID check and a room scan. I had to use my webcam to show every corner of my desk, which was quite tricky.&lt;/p&gt;
&lt;p&gt;Spoiler: all the photos were crooked.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pro tip:&lt;/strong&gt; No paper and pen are allowed. My proctor made me remove even the white sketch paper and pencil.&lt;/p&gt;
&lt;p&gt;So, it was just my computer, and a bottle of water on the table.&lt;/p&gt;
&lt;h2&gt;Wait! Scheduling Challenges&lt;/h2&gt;
&lt;p&gt;Scheduling the AI-102 exam was a challenge in itself. Booking an in-person exam at a nearby exam center was nearly impossible within three weeks.&lt;/p&gt;
&lt;p&gt;I checked all the exam centers online. No luck!&lt;/p&gt;
&lt;p&gt;I just checked again, you can feel it:&lt;/p&gt;
&lt;figure&gt;
&lt;img src=&quot;./no-slot-1024x532.png&quot; alt=&quot;Figure-1: No exam slot available&quot; /&gt;
&lt;figcaption&gt;Figure-1: No exam slot available&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;There were no available slots until October!&lt;/p&gt;
&lt;figure&gt;
&lt;img src=&quot;./october-slot-1024x532.png&quot; alt=&quot;Figure 2: Earliest available exam slot found in October 2024&quot; /&gt;
&lt;figcaption&gt;Figure 2: Earliest available exam slot found in October 2024&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;p&gt;So I switched to an online exam, but the earliest slots were at 5 AM, with only one slot per day.&lt;/p&gt;
&lt;p&gt;After much searching, I found a slot 10 days later. I would be out of town for a couple of days, remember I was still on vacation?!&lt;/p&gt;
&lt;p&gt;So the timeline was: I joined the four-and-a-half-day training, scheduled the exam, went on vacation for several days, then spent three days cramming after I came back.&lt;/p&gt;
&lt;h1&gt;The Moment of Truth&lt;/h1&gt;
&lt;p&gt;When I started the exam, the first question was a massive use case. It was like getting hit with a wall of text.&lt;/p&gt;
&lt;p&gt;I managed to get through it, and then there were some easy questions that helped me relax 😌.&lt;/p&gt;
&lt;p&gt;Time management was a big issue, because I spent too much time on MS Learn double-checking and searching for answers...&lt;/p&gt;
&lt;p&gt;Did I mention that you can use MS Learn during the AI-102 exam? Yes! But it&apos;s not that helpful; MS Learn is not ChatGPT, and you will get overwhelmed.&lt;/p&gt;
&lt;p&gt;It&apos;s only helpful for checking facts if you already know where to find the information the question requires.&lt;/p&gt;
&lt;p&gt;I was in a panic with 6 minutes left and 4 questions to go.&lt;/p&gt;
&lt;p&gt;I quickly selected answers for those questions.&lt;/p&gt;
&lt;p&gt;With a mix of excitement and nerves, I hit that &quot;submit&quot; button like it was the launch of a rocket.&lt;/p&gt;
&lt;h1&gt;The Sweet Victory&lt;/h1&gt;
&lt;p&gt;Drumroll, please… 768! I passed!&lt;/p&gt;
&lt;p&gt;It was a close call, but I did it. The journey was a mix of hard work, a lot of stress, and a bit of fun.&lt;/p&gt;
&lt;p&gt;I’m glad I did it.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;So there you have it—a glimpse into my adventure of becoming a &lt;strong&gt;Microsoft Certified: Azure AI Engineer Associate (AI-102)&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Looking back, I started my AI learning journey at last year&apos;s hackathon event in my company. I built a RAG chatbot, integrated it into the company&apos;s product portal, and garnered huge attention from the leadership team. Then I deep-dived into LLM/Agent/RAG technologies and kept advancing my hackathon project! Then I worked hard for around 30 hours and passed &lt;a href=&quot;/genai/machine-learning/supervised-machine-learning-day-11-12/&quot;&gt;Andrew Ng&apos;s Supervised Machine Learning&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I am eager to apply my newly acquired skills and knowledge to further innovate and contribute to the field of AI and cloud services.&lt;/p&gt;
&lt;p&gt;If you&apos;re on a similar path, keep learning, stay focused, and you will be there as well.&lt;/p&gt;
&lt;p&gt;Good luck!&lt;/p&gt;
&lt;p&gt;#MicrosoftCertified #AzureAI #ExamJourney #TechLife #NeverStopLearning&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Instantly Remove Duplicate Photos With A Handy Script</title><link>https://geekcoding101.com/posts/instantly-remove-duplicate-photos-with-a-handy-script</link><guid isPermaLink="true">https://geekcoding101.com/posts/instantly-remove-duplicate-photos-with-a-handy-script</guid><pubDate>Mon, 02 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;:::info&lt;/p&gt;
&lt;p&gt;Thanksgiving usually brings memories of food, family, and laughter. For me, this year added an unexpected twist: cleaning up a &lt;em&gt;massive&lt;/em&gt; library of duplicate photos stored on my WD NAS. What started as a manual chore turned into a tech-fueled triumph, thanks to the power of large language models (LLMs) like ChatGPT.&lt;/p&gt;
&lt;p&gt;This is the story of how I turned a frustrating task (remove duplicate photos) into an automated solution—and how AI transformed me from a frustrated photo hoarder into a digital decluttering superhero.&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;h1&gt;&lt;strong&gt;The Problem: Too Much Dust on Old Photos, I need &quot;Remove Duplicate Photos&quot; cleaner&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;Imagine sifting through &lt;strong&gt;tens of thousands of photos&lt;/strong&gt;—manually. I mounted the NAS SMB partition on my MacBook, only to discover it was excruciatingly slow. After two days of copying files to my MacBook, my manual review session turned into a blur. My eyes hurt, my patience wore thin, and I knew there had to be a better way.&lt;/p&gt;
&lt;p&gt;When I turned to existing tools for &quot;remove duplicate photo&quot; task, I hit a wall. Most were paid, overly complex, or simply didn’t fit my needs. Even the so-called free solutions required learning arcane commands like &lt;code&gt;find&lt;/code&gt;. I needed something powerful, flexible, and fast. And when all else fails, what’s a tech enthusiast to do? Write their own solution—with a &quot;little&quot; help from ChatGPT.&lt;/p&gt;
&lt;h1&gt;&lt;strong&gt;The Power of ChatGPT&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;I’d dabbled with the same task scripting years ago but quickly gave up because of the time it required. Enter ChatGPT (no marketing here... I am a paid user though...), the real hero of this story. With its assistance, I wrote the majority of the script in less than a day before i gave up !&lt;/p&gt;
&lt;p&gt;:::warning&lt;/p&gt;
&lt;p&gt;I originally thought I could finish this script in just two hours with the help of ChatGPT, but... I ended up spending almost the entire day on it! So all those online posts claiming you can make an iOS app in two hours? They should be reported straight away!&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;p&gt;But anyway, I still have to thank the emergence of Large Language Models! Given the current volume and quality of the code, a single person working alone would have needed 10 to 15 days to achieve the same results! So I believe LLMs improved my efficiency by at least 10 times, and they helped me avoid all sorts of unnecessary detours!&lt;/p&gt;
&lt;p&gt;So now, I&apos;ve created &lt;a href=&quot;https://github.com/geekcoding101/get_rid_of_dup&quot;&gt;&lt;code&gt;get_rid_of_dup.py&lt;/code&gt;&lt;/a&gt;,&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/geekcoding101/get_rid_of_dup&quot;&gt;&lt;img src=&quot;./How-I-Automated-My-Photo-Cleanup.webp&quot; alt=&quot;remove duplicate photos github repo&quot; title=&quot;remove duplicate photos github repo&quot; /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;a Python-based command-line tool designed to find and remove duplicate files. The entire experience was a testament to how LLMs have redefined productivity for engineers and non-coders alike. Today, LLMs don&apos;t just help you write code; they make you feel like a superhero with a cape woven from AI-driven efficiency.&lt;/p&gt;
&lt;h1&gt;&lt;strong&gt;How the &quot;Remove Duplicate Photos&quot; Script Works&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;The script operates in two modes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Single Directory Duplicate Detection&lt;/strong&gt; Quickly finds duplicates within the same folder, using a simple one-command setup. Example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python get_rid_of_dup.py dedup --base-dir ./photos --max-width 50 --verbose
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./How-I-Automated-My-Photo-Cleanup-05.webp&quot; alt=&quot;remove duplicate photos runtime screenshot 01&quot; title=&quot;remove duplicate photos runtime screenshot 01&quot; /&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Cross-Directory Duplicate Detection&lt;/strong&gt; Compare files across two directories, using one as a base directory while cleaning the duplicates in the other. This mode ensures that your originals remain untouched. Example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python get_rid_of_dup.py search --base-dir ./test ./others --max-width 50 --verbose --exclude &quot;*.DS_Store&quot;
python get_rid_of_dup.py checksum --base-dir ./originals ./backup
python get_rid_of_dup.py delete --base-dir ./originals ./backup
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./How-I-Automated-My-Photo-Cleanup-04.webp&quot; alt=&quot;remove duplicate photos runtime screenshot 02&quot; title=&quot;remove duplicate photos runtime screenshot 02&quot; /&gt; &lt;img src=&quot;./How-I-Automated-My-Photo-Cleanup-02-1024x175.webp&quot; alt=&quot;remove duplicate photos runtime screenshot 03&quot; title=&quot;remove duplicate photos runtime screenshot 03&quot; /&gt; &lt;img src=&quot;./How-I-Automated-My-Photo-Cleanup-03-1024x377.webp&quot; alt=&quot;remove duplicate photos runtime screenshot 04&quot; title=&quot;remove duplicate photos runtime screenshot 04&quot; /&gt; Under the hood, the script uses checksum comparisons (via the &lt;code&gt;xxhash&lt;/code&gt; library) to identify duplicates quickly. It can also save checksum data for reuse, making subsequent runs much faster.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
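&lt;p&gt;To make the checksum idea concrete, here&apos;s a minimal stdlib-only sketch of duplicate detection. It uses &lt;code&gt;hashlib&lt;/code&gt; where the real script uses &lt;code&gt;xxhash&lt;/code&gt;, and the function names here are illustrative, not the script&apos;s actual API:&lt;/p&gt;

```python
# Minimal sketch of checksum-based duplicate detection.
# Uses hashlib's blake2b in place of the (faster) xxhash library.
import hashlib
import os

def file_checksum(path, chunk_size=65536):
    # Hash the file in chunks so large photos never fully load into memory.
    h = hashlib.blake2b()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(base_dir):
    seen = {}   # checksum -&#62; first path seen, treated as the original
    dups = []   # (duplicate, original) pairs
    for root, _dirs, files in os.walk(base_dir):
        for name in sorted(files, key=len):   # shortest name first
            path = os.path.join(root, name)
            digest = file_checksum(path)
            if digest in seen:
                dups.append((path, seen[digest]))
            else:
                seen[digest] = path
    return dups
```

&lt;p&gt;Hashing file contents means two files compare as duplicates regardless of their names or timestamps, and sorting by name length mirrors the shortest-name-as-original heuristic the script uses in single-directory mode.&lt;/p&gt;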
&lt;h2&gt;&lt;strong&gt;A Few Things to Highlight about &quot;Remove Duplicate Photos&quot;&lt;/strong&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Performance&lt;/strong&gt;: Scanning 30,000+ files (including large images) took under a minute. That’s faster than it takes me to make coffee.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Flexibility&lt;/strong&gt;: Features like &lt;code&gt;--skip-existing&lt;/code&gt; and &lt;code&gt;--verbose&lt;/code&gt; make the tool adaptable to different workflows.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Practical Design Choices&lt;/strong&gt;: For example, in single-directory mode, the script selects the file with the shortest name as the original, ensuring clean and logical results.&lt;/li&gt;
&lt;/ul&gt;
&lt;h1&gt;&lt;strong&gt;Reflecting&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;Reflecting on this experience, it’s clear that LLMs like ChatGPT are redefining productivity.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Empowering Coders and Non-Coders Alike&lt;/strong&gt; ChatGPT doesn’t just write code—it &lt;em&gt;teaches&lt;/em&gt;. For non-coders, it demystifies programming. For seasoned developers, it accelerates workflow and sparks new ideas.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Making the Impossible, Possible&lt;/strong&gt; Tasks I once considered “too complex” to script suddenly became doable. With ChatGPT’s guidance, I tackled nuanced logic, performance tuning, and error handling in record time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Turning Good Engineers Into Great Ones&lt;/strong&gt; LLMs are like an extension of your brain. They handle repetitive tasks, suggest improvements, and help you focus on the creative aspects of problem-solving.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;As I watched this project come together, I couldn’t help but feel a deep sense of gratitude—not just for solving my duplicate photo problem, but for living in an era where tools like ChatGPT exist. From now on, removing duplicate photos is just a piece of cake.&lt;/p&gt;
&lt;h1&gt;&lt;strong&gt;Ready to Declutter Your Files&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;The script is open-source and ready to use. Head over to my GitHub to get started: &lt;a href=&quot;https://github.com/geekcoding101/get_rid_of_dup&quot;&gt;&lt;code&gt;get_rid_of_dup.py&lt;/code&gt;&lt;/a&gt;. Here’s a quick summary of what it can do:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Search for duplicates&lt;/strong&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python get_rid_of_dup.py search --base-dir ./photos ./comparison --max-width 100
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Generate and save checksums&lt;/strong&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python get_rid_of_dup.py checksum --base-dir ./photos ./backup
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Delete duplicates safely:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python get_rid_of_dup.py delete --base-dir ./photos ./backup
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h1&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;This Thanksgiving, I walked away with more than just turkey leftovers. I gained a clean photo library with my remove duplicate photos script, a newfound appreciation for automation, and a deeper respect for what AI can achieve.&lt;/p&gt;
&lt;p&gt;If you’re dealing with file clutter—or any repetitive task—let ChatGPT and Python be your allies. Trust me, they’ll turn a daunting chore into a satisfying win.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;And who knows? Your next big idea might just be an LLM-powered breakthrough waiting to happen.&lt;/p&gt;
&lt;p&gt;If you&apos;ve missed the link to my GitHub repository, here you go: &lt;a href=&quot;https://github.com/geekcoding101/get_rid_of_dup&quot;&gt;https://github.com/geekcoding101/get_rid_of_dup&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Why is the Transformer Model Called an &quot;AI Revolution&quot;?</title><link>https://geekcoding101.com/posts/why-is-the-transformer-model-called-an-ai-revolution</link><guid isPermaLink="true">https://geekcoding101.com/posts/why-is-the-transformer-model-called-an-ai-revolution</guid><pubDate>Tue, 03 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;:::info&lt;/p&gt;
&lt;p&gt;Hello, and welcome to the first edition of &lt;strong&gt;Daily AI Insight&lt;/strong&gt;, a series dedicated to unraveling the fascinating world of artificial intelligence, one bite-sized topic at a time.&lt;/p&gt;
&lt;p&gt;AI is moving fast, and keeping up with it can feel overwhelming. &lt;strong&gt;Daily AI Insight&lt;/strong&gt; aims to bridge that gap. Every post will break down a key concept, a research trend, or a real-world application of AI into digestible, easy-to-understand insights. Whether you&apos;re an AI enthusiast, a professional looking to integrate AI into your work, or just curious about what all the fuss is about, this series is for you.&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;h3&gt;&lt;strong&gt;1. What is the Transformer?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;The Transformer is a deep learning architecture introduced by Google Research in 2017 through the seminal paper &lt;em&gt;Attention is All You Need&lt;/em&gt;. Originally designed to tackle challenges in natural language processing (NLP), it has since transformed into the foundation for state-of-the-art AI models in multiple domains, such as computer vision, speech processing, and multimodal learning.&lt;/p&gt;
&lt;p&gt;Traditional NLP models like RNNs (Recurrent Neural Networks) and LSTMs (Long Short-Term Memory networks) had two significant shortcomings:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Sequential Processing&lt;/strong&gt;: These models processed text one token at a time, slowing down computations and making it hard to parallelize.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Difficulty Capturing Long-Range Dependencies&lt;/strong&gt;: For long sentences or documents, these models often lost crucial contextual information from earlier parts of the input.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The Transformer introduced a novel &lt;strong&gt;Self-Attention Mechanism&lt;/strong&gt;, enabling it to process entire input sequences simultaneously and focus dynamically on the most relevant parts of the sequence. Think of it like giving the model a panoramic lens, allowing it to view the entire context at once, rather than just focusing on one word at a time.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./The-Structure-of-the-Transformer.webp&quot; alt=&quot;The Structure of the Transformer&quot; title=&quot;The Structure of the Transformer&quot; /&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;2. Why is the Transformer Important?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;The Transformer brought a paradigm shift to AI, fundamentally altering how models process, understand, and generate information. Here&apos;s why it’s considered revolutionary:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Parallel Processing&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Unlike RNNs that process data step by step, Transformers can analyze all parts of the input sequence simultaneously. This parallelism significantly speeds up training and inference, making it feasible to train models on massive datasets.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(2) Better Understanding of Context&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;The Self-Attention Mechanism enables the Transformer to capture relationships between all tokens in a sequence. For example, in the sentence: “Although it was raining, she decided to go for a run,” the words &quot;although&quot; and &quot;decided&quot; are closely related, even though they’re separated by other words. Transformers excel at identifying and using such relationships.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(3) Scalability&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;The modular architecture of the Transformer makes it easy to scale. This is why the largest AI models today, like OpenAI&apos;s GPT series, Google’s BERT, and other LLMs (Large Language Models), all stem from this architecture.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;3. How Does the Transformer Work?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;The Transformer is built around two core components: &lt;strong&gt;Encoders&lt;/strong&gt; and &lt;strong&gt;Decoders&lt;/strong&gt;. Here’s how they function:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) The Encoder&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;The encoder processes the input sequence, such as a sentence, and transforms it into a series of rich, context-aware vector representations. For instance, in a translation task, the encoder might analyze the sentence “I love programming” and create numerical embeddings for each word, capturing their meaning and relationships.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(2) The Decoder&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;The decoder takes the encoder’s output and generates the target sequence. For translation, it might turn the encoded representations into a sentence like “J’aime programmer” in French.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;Self-Attention Mechanism in Action&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;The heart of the Transformer lies in self-attention, which allows the model to compute the importance of each word relative to every other word in a sequence. For instance, when processing “I love programming,” the word “love” has strong ties to “programming,” which the attention mechanism identifies and weighs heavily during computations.&lt;/p&gt;
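&lt;p&gt;To see the mechanism in miniature, here&apos;s a toy scaled dot-product self-attention pass in NumPy, following the formula from &lt;em&gt;Attention is All You Need&lt;/em&gt;. All the numbers are random placeholders, not a trained model:&lt;/p&gt;

```python
# Toy scaled dot-product self-attention; every token attends to every token.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])       # relevance of every token pair
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights = e / e.sum(axis=1, keepdims=True)   # softmax: each row sums to 1
    return weights @ V, weights                  # context vectors, attention map

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))       # 3 token embeddings, e.g. 'I', 'love', 'programming'
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))
context, weights = self_attention(X, Wq, Wk, Wv)
print(context.shape)              # (3, 4): one context vector per token
```

&lt;p&gt;Each row of the attention map is a probability distribution over the whole sequence, which is exactly the &quot;panoramic lens&quot; described earlier: every token&apos;s new representation is a weighted blend of all the others.&lt;/p&gt;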
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;4. Key Applications and Models&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;The success of the Transformer architecture has led to the development of many groundbreaking AI models across different domains:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Natural Language Processing&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;BERT (Bidirectional Encoder Representations from Transformers)&lt;/strong&gt;: A Google model designed for understanding the meaning of text in context, widely used for search engines, question answering, and sentiment analysis.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GPT Series (Generative Pre-trained Transformers)&lt;/strong&gt;: OpenAI’s series of models, from GPT-2 to GPT-4, excel in text generation, from creative writing to code completion.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(2) Computer Vision&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Vision Transformer (ViT)&lt;/strong&gt;: Adapts the Transformer architecture for image recognition tasks, segmenting an image into patches and applying self-attention to understand relationships between different parts of the image.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(3) Multimodal AI&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Models like &lt;strong&gt;CLIP&lt;/strong&gt; and &lt;strong&gt;DALL-E&lt;/strong&gt; use Transformers to handle text and image inputs, enabling AI to generate art from text descriptions or describe images in natural language.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;5. Advantages of the Transformer&lt;/strong&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Efficiency&lt;/strong&gt;: Parallel processing dramatically reduces training time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Versatility&lt;/strong&gt;: Adaptable to various tasks beyond NLP, such as computer vision and multimodal applications.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Scalability&lt;/strong&gt;: Easy to scale up for training large models on massive datasets.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;6. Challenges and Limitations&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Despite its advantages, the Transformer is not without drawbacks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;High Computational Costs&lt;/strong&gt;: Training Transformers, especially large-scale ones like GPT-4, requires enormous computational resources and specialized hardware like GPUs or TPUs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Data-Hungry&lt;/strong&gt;: Transformers need vast amounts of labeled data for training, making them inaccessible for smaller organizations or domains with limited data availability.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Lack of Interpretability&lt;/strong&gt;: While the self-attention mechanism provides flexibility, the inner workings of Transformers remain a “black box,” posing challenges for applications like healthcare and legal systems where decisions need to be transparent.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;7. Transformative Impact&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;The Transformer has reshaped AI research and applications, enabling breakthroughs in natural language understanding, image recognition, and generative AI. It’s the foundation for innovations like ChatGPT, automated translation, and content creation tools.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;8. One-Line Summary&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;The Transformer revolutionized AI with its self-attention mechanism and scalability, making it the cornerstone of modern AI and driving advancements across multiple domains.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>7 Key Insights on the Self-Attention Mechanism in AI Magic</title><link>https://geekcoding101.com/posts/the-self-attention-in-ai-magic</link><guid isPermaLink="true">https://geekcoding101.com/posts/the-self-attention-in-ai-magic</guid><pubDate>Wed, 04 Dec 2024 00:00:00 GMT</pubDate><content:encoded>
&lt;p&gt;&quot;Self Attention&quot;, a pivotal advancement in deep learning, is at the core of the Transformer architecture, revolutionizing how models process and understand sequences. Unlike traditional Attention, which focuses on mapping relationships between separate input and output sequences, Self-Attention enables each element within a sequence to interact dynamically with every other element. This mechanism allows AI models to capture long-range dependencies more effectively than previous architectures like RNNs and LSTMs. By computing relevance scores between words in a sentence, Self-Attention ensures that key relationships—such as pronoun references or contextual meanings—are accurately identified, leading to more sophisticated language understanding and generation.&lt;/p&gt;
&lt;h1&gt;&lt;strong&gt;1. The Origin of the Attention Mechanism&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;The Attention Mechanism is one of the most transformative innovations in deep learning. First introduced in the 2014 paper &lt;a href=&quot;https://arxiv.org/abs/1409.0473&quot;&gt;&lt;em&gt;Neural Machine Translation by Jointly Learning to Align and Translate&lt;/em&gt;&lt;/a&gt;, it was designed to address a critical challenge: how can a model effectively focus on the most relevant parts of input data, especially in tasks involving long sequences?&lt;/p&gt;
&lt;p&gt;Simply put, the Attention Mechanism allows models to “prioritize,” much like humans skip unimportant details when reading and focus on the key elements. This breakthrough marks a shift in AI from rote memorization to dynamic understanding.&lt;/p&gt;
&lt;hr /&gt;
&lt;h1&gt;&lt;strong&gt;2. The Core Idea Behind the Attention Mechanism&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;The Attention Mechanism’s main idea is simple yet powerful: it enables the model to assign different levels of importance to different parts of the input data. Each part of the sequence is assigned a weight, with higher weights indicating greater relevance to the task at hand.&lt;/p&gt;
&lt;p&gt;For example, when translating the sentence “I love cats,” the model needs to recognize that the relationship between &quot;love&quot; and &quot;cats&quot; is more critical than that between &quot;I&quot; and &quot;cats.&quot; The Attention Mechanism dynamically computes these relationships and helps the model focus accordingly.&lt;/p&gt;
&lt;h2&gt;&lt;strong&gt;How It Works (Simplified)&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;Here’s how the Attention Mechanism operates in three key steps:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Relevance Scoring&lt;/strong&gt; Each element of the input sequence is compared against the rest to compute a “relevance score.”&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Weight Normalization&lt;/strong&gt; These scores are converted into probabilities using a Softmax function, ensuring all weights sum to 1.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Weighted Summation&lt;/strong&gt; The weights are then used to compute a new “context vector” that emphasizes the most relevant parts of the input.&lt;/li&gt;
&lt;/ol&gt;
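&lt;p&gt;The three steps above can be sketched in a few lines of plain Python. The 2-D word vectors below are invented purely for illustration:&lt;/p&gt;

```python
import math

# The three attention steps on tiny made-up 2-D word vectors.
def attend(query, keys, values):
    # 1. Relevance scoring: dot product of the query with every key
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # 2. Weight normalization: softmax turns scores into probabilities
    exps = [math.exp(s - max(scores)) for s in scores]
    weights = [e / sum(exps) for e in exps]
    # 3. Weighted summation: blend the values into one context vector
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context

# Invented vectors for 'book', 'yesterday', 'I'; the query plays the role of 'it'.
keys = values = [[1.0, 0.0], [0.1, 0.9], [0.0, 0.2]]
weights, context = attend([1.0, 0.0], keys, values)
print(max(range(3), key=weights.__getitem__))   # 0: 'it' attends most to 'book'
```

&lt;p&gt;Because the query vector for &quot;it&quot; is most similar to the key for &quot;book,&quot; step 1 gives that pair the highest score, and after the softmax in step 2 most of the weight lands on &quot;book&quot;—the same resolution of pronoun reference discussed in the example below.&lt;/p&gt;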
&lt;hr /&gt;
&lt;h1&gt;&lt;strong&gt;3. The Magic of Self-Attention&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;Self-Attention, a variant of the Attention Mechanism, lies at the heart of the Transformer architecture. Unlike traditional Attention that focuses on external target sequences (e.g., translating between languages), Self-Attention allows every element in a sequence to interact with every other element within the same sequence. This enhances the model&apos;s ability to understand global relationships.&lt;/p&gt;
&lt;h2&gt;&lt;strong&gt;Example: Sentence Understanding With Self Attention&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;Consider the sentence: &lt;em&gt;“I bought a book yesterday. It is fascinating.”&lt;/em&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A model with Self-Attention can identify that the word &quot;it&quot; refers to &quot;book,&quot; not &quot;yesterday&quot; or &quot;I.&quot;&lt;/li&gt;
&lt;li&gt;It does this by calculating the relevance between &quot;it&quot; and all other words, assigning the highest weight to &quot;book.&quot;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This ability to dynamically analyze relationships is a significant improvement over traditional RNNs and LSTMs, which struggle with such long-range dependencies.&lt;/p&gt;
&lt;hr /&gt;
&lt;h1&gt;&lt;strong&gt;4. Real-World Applications of the Attention Mechanism&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;The Attention Mechanism has proven invaluable across a wide range of AI tasks. Here are some of its most impactful applications:&lt;/p&gt;
&lt;h2&gt;&lt;strong&gt;(1) Machine Translation&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;In neural machine translation, the Attention Mechanism dynamically focuses on relevant parts of the source sentence, allowing for more accurate and fluent translations.&lt;/p&gt;
&lt;h2&gt;&lt;strong&gt;(2) Large Language Models&lt;/strong&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Transformer Architecture&lt;/strong&gt;: Attention is the backbone of Transformer models, powering both the encoder and decoder components.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GPT and BERT&lt;/strong&gt;: These models leverage multi-layer Self-Attention to significantly enhance natural language understanding and generation.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;&lt;strong&gt;(3) Computer Vision&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;In computer vision, Attention is utilized in Vision Transformers (ViT). These models divide an image into patches and use Self-Attention to identify relationships between different parts of the image, achieving performance that often surpasses traditional convolutional neural networks (CNNs).&lt;/p&gt;
&lt;h2&gt;&lt;strong&gt;(4) Multimodal Models&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;Multimodal models like CLIP and DALL-E use Attention to process both text and image inputs simultaneously, enabling tasks such as generating artwork from text descriptions or captioning images.&lt;/p&gt;
&lt;hr /&gt;
&lt;h1&gt;&lt;strong&gt;5. Why Is the Attention Mechanism So Powerful?&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;The Attention Mechanism is often called “AI magic” because of its remarkable advantages:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Global Understanding&lt;/strong&gt;: By analyzing relationships across the entire sequence, models can comprehend complex contexts.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Handling Long Sequences&lt;/strong&gt;: Traditional models like RNNs struggle with long-distance dependencies, but Attention can relate any two elements directly, no matter how far apart they sit in the sequence.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Broad Applicability&lt;/strong&gt;: From text to images to multimodal tasks, the Attention Mechanism is versatile and widely adopted.&lt;/li&gt;
&lt;/ol&gt;
&lt;hr /&gt;
&lt;h1&gt;&lt;strong&gt;6. Challenges and Limitations&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;While the Attention Mechanism is transformative, it isn’t without its drawbacks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Computational Cost&lt;/strong&gt;: Calculating relationships between all elements in a sequence requires significant computation, particularly for long sequences.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Scalability&lt;/strong&gt;: The quadratic complexity (O(n²)) of Self-Attention poses challenges for tasks involving very large inputs, though ongoing research is addressing this issue.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h1&gt;&lt;strong&gt;7. The Impact of Attention: From Focus to Revolution&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;The Attention Mechanism represents a paradigm shift in AI, enabling models to focus dynamically on the most relevant information. By solving key challenges in sequence modeling and understanding, it has paved the way for groundbreaking architectures like the Transformer and applications across diverse domains.&lt;/p&gt;
&lt;hr /&gt;
&lt;h1&gt;&lt;strong&gt;8. One-Line Summary&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;The Attention Mechanism empowers AI with the ability to “prioritize,” making it an indispensable tool for understanding, generating, and analyzing complex data.&lt;/p&gt;
&lt;p&gt;Thanks for being with me on this journey through the Self-Attention Mechanism!&lt;/p&gt;
&lt;p&gt;You&apos;re welcome to browse my other &lt;a href=&quot;https://geekcoding101.com/tags/daily-ai-insights&quot;&gt;AI Insights blog posts here&lt;/a&gt;.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>What Are Parameters? Why Are “Bigger” Models Often “Smarter”?</title><link>https://geekcoding101.com/posts/what-are-parameters-why-are-bigger-models-often-smarter</link><guid isPermaLink="true">https://geekcoding101.com/posts/what-are-parameters-why-are-bigger-models-often-smarter</guid><pubDate>Thu, 05 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h3&gt;&lt;strong&gt;1. What Are Parameters?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;In deep learning, &lt;strong&gt;parameters&lt;/strong&gt; are the trainable components of a model, such as weights and biases, which determine how the model responds to input data. These parameters adjust during training to minimize errors and optimize the model&apos;s performance. &lt;strong&gt;Parameter count&lt;/strong&gt; refers to the total number of such weights and biases in a model.&lt;/p&gt;
&lt;p&gt;Think of parameters as the “brain capacity” of an AI model. &lt;strong&gt;The more parameters it has, the more information it can store and process.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;A simple linear regression model might have only a few parameters, such as a weight (&lt;em&gt;w&lt;/em&gt;) and a bias (&lt;em&gt;b&lt;/em&gt;).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;GPT-3, a massive language model, boasts &lt;strong&gt;175 billion parameters&lt;/strong&gt;, requiring immense computational resources and data to train.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
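&lt;p&gt;To see where such counts come from, here&apos;s a quick back-of-the-envelope calculation for a small fully connected network. The layer sizes are made up for illustration:&lt;/p&gt;

```python
# Each dense layer contributes in_features * out_features weights
# plus out_features biases.
def dense_params(n_in, n_out):
    return n_in * n_out + n_out

layer_sizes = [784, 128, 10]   # e.g. a small MNIST-style classifier
total = sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))
print(total)   # 101770 parameters in total
```

&lt;p&gt;By the same formula, the linear regression above has just two parameters (one weight, one bias), while GPT-3&apos;s 175 billion come from stacking many very wide layers of exactly this kind.&lt;/p&gt;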
&lt;p&gt;&lt;img src=&quot;./issue-3-parameters.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;2. The Relationship Between Parameter Count and Model Performance&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;In deep learning, there is often a positive correlation between a model&apos;s parameter count and its performance. This phenomenon is summarized by &lt;strong&gt;Scaling Laws&lt;/strong&gt;, which show that as parameters, data, and computational resources increase, so does the model&apos;s ability to perform complex tasks.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;Why Are Bigger Models Often Smarter?&lt;/strong&gt;&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Higher Expressive Power&lt;/strong&gt; Larger models can capture more complex patterns and features in data. For instance, they not only grasp basic grammar but also understand deep semantic and contextual nuances.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Stronger Generalization&lt;/strong&gt; With sufficient training data, larger models generalize better to unseen scenarios, such as answering novel questions or reasoning about unfamiliar topics.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Versatility&lt;/strong&gt; Bigger models can handle multiple tasks with minimal or no additional training. For example, OpenAI&apos;s GPT models excel in creative writing, code generation, translation, and logical reasoning.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;However, &lt;strong&gt;bigger isn’t always better.&lt;/strong&gt; If the parameter count exceeds the amount of data or the complexity of the task, the model may become overly complex and prone to overfitting.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;3. The Practical Significance of Parameter Count&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;Language Models at Scale&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Here’s a comparison of parameter counts for well-known models:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;GPT-2: 1.5 billion parameters&lt;/li&gt;
&lt;li&gt;GPT-3: 175 billion parameters&lt;/li&gt;
&lt;li&gt;GPT-4: reportedly on the order of 1.7 trillion parameters (OpenAI has not disclosed an official figure)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As parameter counts grow, these models have demonstrated remarkable improvements in:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Text fluency&lt;/strong&gt;: Generating coherent and contextually appropriate responses.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reasoning&lt;/strong&gt;: Solving logical puzzles or providing detailed explanations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Creativity&lt;/strong&gt;: Writing essays, poetry, and even code snippets.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;In Computer Vision&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Parameter count is equally significant in image recognition. For instance:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;ResNet&lt;/strong&gt;: Early versions had on the order of ten million to several tens of millions of parameters.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Vision Transformers (ViT)&lt;/strong&gt;: These modern architectures often have hundreds of millions of parameters, enabling them to outperform traditional convolutional networks on complex tasks.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;4. Are Bigger Models Always Better?&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;Advantages of Large Models&lt;/strong&gt;&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Capture Complex Patterns&lt;/strong&gt;: They can model intricate relationships in data that smaller models might miss.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Task Versatility&lt;/strong&gt;: One large model can handle diverse tasks without needing significant fine-tuning.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Breakthroughs in Performance&lt;/strong&gt;: Larger models often lead to state-of-the-art results across many benchmarks.&lt;/li&gt;
&lt;/ol&gt;
&lt;h4&gt;&lt;strong&gt;Drawbacks of Large Models&lt;/strong&gt;&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;High Computational Cost&lt;/strong&gt;: Bigger models require immense resources for both training and inference. For example, training GPT-3 reportedly cost millions of dollars in compute time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Energy Consumption&lt;/strong&gt;: Training large models has a significant environmental impact, as it demands enormous amounts of energy.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Efficiency Issues&lt;/strong&gt;: For certain tasks, smaller, task-specific models may achieve similar results with far fewer resources.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;As a result, choosing the right model size involves balancing &lt;strong&gt;performance gains&lt;/strong&gt; against &lt;strong&gt;computational efficiency&lt;/strong&gt;.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;5. Trends in Parameter Optimization: Big Models vs. Small Models&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Despite the success of large models, recent trends highlight the growing importance of &lt;strong&gt;efficient AI&lt;/strong&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Parameter Compression&lt;/strong&gt;: Techniques like &lt;strong&gt;knowledge distillation&lt;/strong&gt; and &lt;strong&gt;model pruning&lt;/strong&gt; extract the most valuable knowledge from large models and condense it into smaller, faster models.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Efficient Inference&lt;/strong&gt;: Lightweight models, such as &lt;strong&gt;DistilBERT&lt;/strong&gt;, are designed for mobile devices and embedded systems, making AI more accessible and sustainable.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Task-Specific Optimization&lt;/strong&gt;: Instead of using a massive model for every problem, fine-tuning smaller models for specific tasks often yields better cost-effectiveness.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The likely future of AI involves &lt;strong&gt;large-scale pretraining paired with smaller, fine-tuned deployments&lt;/strong&gt;, combining the strengths of both approaches.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;6. One-Line Summary&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Parameter count represents the &quot;brain capacity&quot; of an AI model. While larger models often excel at complex tasks, balancing size and efficiency is key to sustainable AI development.&lt;/strong&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Your Thoughts&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Do you think the race for larger models is sustainable, or should the focus shift toward efficiency and accessibility? Share your perspective in the comments below!&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>What Is Prompt Engineering and How to &quot;Train&quot; AI with a Single Sentence?</title><link>https://geekcoding101.com/posts/what-is-prompt-engineering-and-how-to-train-ai-with-a-single-sentence</link><guid isPermaLink="true">https://geekcoding101.com/posts/what-is-prompt-engineering-and-how-to-train-ai-with-a-single-sentence</guid><pubDate>Fri, 06 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h3&gt;1. What is Prompt Engineering?&lt;/h3&gt;
&lt;p&gt;Prompt Engineering is a core technique in the field of generative AI. Simply put, it involves crafting effective input prompts to guide AI in producing the desired results.&lt;/p&gt;
&lt;p&gt;Generative AI models (like GPT-3 and GPT-4) are essentially predictive tools that generate outputs based on input prompts. The goal of Prompt Engineering is to optimize these inputs to ensure that the AI performs tasks according to user expectations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Here’s an example&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Input: &lt;em&gt;“Explain quantum mechanics in one sentence.”&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;Output: &lt;em&gt;“Quantum mechanics is a branch of physics that studies the behavior of microscopic particles.”&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The quality of the prompt directly impacts AI performance. A clear and targeted prompt can significantly improve the results generated by the model.&lt;/p&gt;
&lt;h3&gt;2. Why is Prompt Engineering important?&lt;/h3&gt;
&lt;p&gt;The effectiveness of generative AI depends heavily on how users present their questions or tasks. The importance of Prompt Engineering can be seen in the following aspects:&lt;/p&gt;
&lt;h4&gt;(1) &lt;strong&gt;Improving output quality&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;A well-designed prompt reduces the risk of the AI generating incorrect or irrelevant responses. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ineffective Prompt: &lt;em&gt;“Write an article about climate change.”&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;Optimized Prompt: &lt;em&gt;“Write a brief 200-word report on the impact of climate change on the Arctic ecosystem.”&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;(2) &lt;strong&gt;Saving time and cost&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;A clear prompt minimizes trial and error, improving efficiency, especially in scenarios requiring large-scale outputs (e.g., generating code or marketing content).&lt;/p&gt;
&lt;h4&gt;(3) &lt;strong&gt;Expanding AI’s use cases&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;With clever prompt design, users can leverage AI for diverse and complex tasks, from answering questions to crafting poetry, generating code, or even performing data analysis.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;3. Core techniques in Prompt Engineering&lt;/h3&gt;
&lt;p&gt;Designing an effective prompt involves several principles and strategies:&lt;/p&gt;
&lt;h4&gt;(1) &lt;strong&gt;Define clear goals&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Prompts should directly target the task at hand. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Vague Prompt: &lt;em&gt;“Talk about animals.”&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;Clear Prompt: &lt;em&gt;“Describe the behavior of lions in three sentences, including one interesting fact.”&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;(2) &lt;strong&gt;Provide context&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Context helps AI better understand the task. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Isolated Prompt: &lt;em&gt;“Generate a paragraph about carbon dioxide.”&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;Contextualized Prompt: &lt;em&gt;“Carbon dioxide is a major greenhouse gas contributing to global warming. Based on this, generate a 200-word article.”&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;(3) &lt;strong&gt;Control output style&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;By including descriptive language, users can adjust the AI&apos;s tone or style. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;General Prompt: &lt;em&gt;“Write a paragraph about cats.”&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;Styled Prompt: &lt;em&gt;“Write a humorous paragraph about why cats are smarter than dogs.”&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;(4) &lt;strong&gt;Iterative refinement&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Prompts can be iteratively improved. Start with an initial output, then refine the prompt to address any shortcomings.&lt;/p&gt;
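&lt;p&gt;The four techniques above can be combined mechanically. Here’s a minimal Python sketch of a prompt builder; the function and field names are my own illustration, not any library’s API:&lt;/p&gt;

```python
# Compose a prompt from a clear goal, optional context, and style cues.
# All names here are illustrative, not from any prompt-engineering library.
def build_prompt(goal, context=None, style=None, length=None):
    parts = []
    if context:
        parts.append(f"Context: {context}")
    parts.append(f"Task: {goal}")
    if style:
        parts.append(f"Style: {style}")
    if length:
        parts.append(f"Length: about {length} words")
    return "\n".join(parts)

prompt = build_prompt(
    goal="Describe the behavior of lions, including one interesting fact.",
    context="This is for a children's science newsletter.",
    style="friendly and vivid",
    length=100,
)
print(prompt)
```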
&lt;hr /&gt;
&lt;h3&gt;4. Limitations of Prompt Engineering&lt;/h3&gt;
&lt;p&gt;While Prompt Engineering is highly useful, it has its limitations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Experience required&lt;/strong&gt;: Designing effective prompts often requires users to understand how AI operates.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Model understanding constraints&lt;/strong&gt;: Even with well-crafted prompts, the AI may still produce errors or misunderstand the task.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dependence on model versions&lt;/strong&gt;: Responses to prompts can vary significantly between models (e.g., GPT-3 vs. GPT-4).&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;5. In one sentence&lt;/h3&gt;
&lt;p&gt;Prompt Engineering is a critical skill in generative AI, allowing users to efficiently and accurately accomplish tasks by optimizing input prompts – truly the “art of communication” with AI.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;Now, as promised, here are some highly recommended books on Prompt Engineering. They&apos;re packed with practical insights to take your skills to the next level:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Title&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Author&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Published&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Summary&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://www.amazon.com/Art-Prompt-Engineering-chatGPT-Hands/dp/1739296710&quot;&gt;&lt;em&gt;The Art of Prompt Engineering with ChatGPT&lt;/em&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Nathan Hunter&lt;/td&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;A hands-on guide exploring how to use ChatGPT effectively through prompt engineering, with practical techniques to master this art and science.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://www.amazon.com/Prompt-Engineering-Unlocking-Generative-Creative/dp/B0C1J9F65T&quot;&gt;&lt;em&gt;Prompt Engineering: Unlocking Generative AI&lt;/em&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Navveen Balani&lt;/td&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;Focuses on ethical and creative applications of prompt engineering, perfect for those looking to integrate this skill into AI development.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://www.amazon.com/Prompt-Engineering-Generative-AI-Future-Proof/dp/109815343X&quot;&gt;&lt;em&gt;Prompt Engineering for Generative AI&lt;/em&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;James Phoenix, Mike Taylor&lt;/td&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;Offers strategies and tips for designing reliable AI prompts, aimed at developers and engineers optimizing inputs for generative AI models.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://www.amazon.com/Demystifying-Prompt-Engineering-Step-Step/dp/B0C9S3HQXJ&quot;&gt;&lt;em&gt;Demystifying Prompt Engineering&lt;/em&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Harish Bhat&lt;/td&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;Simplifies the complexities of prompt engineering, with step-by-step guides for beginners and AI enthusiasts to create effective prompts.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href=&quot;https://www.amazon.com/Unlocking-Secrets-Prompt-Engineering-generation/dp/1835083838&quot;&gt;&lt;em&gt;Unlocking the Secrets of Prompt Engineering&lt;/em&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Gilbert Mizrahi&lt;/td&gt;
&lt;td&gt;2024&lt;/td&gt;
&lt;td&gt;Delves into the art of prompt engineering with practical techniques, helping readers quickly advance from novice to expert in AI-driven language tasks.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr /&gt;
&lt;p&gt;Another good resource is the &lt;a href=&quot;https://www.promptingguide.ai/&quot;&gt;Prompt Engineering Guide&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;It introduces a wide range of techniques:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Technique&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Reference&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Zero-Shot Prompting&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Instructing the model to perform a task without providing examples, relying on its pre-existing knowledge.&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://www.promptingguide.ai/techniques/zeroshot&quot;&gt;Zero-Shot Prompting&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Few-Shot Prompting&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Supplying a few examples within the prompt to guide the model&apos;s behavior and improve performance on specific tasks.&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://www.promptingguide.ai/techniques/fewshot&quot;&gt;Few-Shot Prompting&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Chain-of-Thought Prompting&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Encouraging the model to articulate a step-by-step reasoning process, aiding in complex problem-solving.&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://www.promptingguide.ai/techniques/cot&quot;&gt;Chain-of-Thought Prompting&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Self-Consistency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Generating multiple reasoning paths and selecting the most consistent answer to enhance accuracy in complex reasoning tasks.&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://www.promptingguide.ai/techniques/consistency&quot;&gt;Self-Consistency&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Generated Knowledge Prompting&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Prompting the model to produce relevant facts before addressing the main task, leveraging its internal knowledge base.&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://www.promptingguide.ai/techniques/knowledge&quot;&gt;Generated Knowledge Prompting&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Prompt Chaining&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Breaking down complex tasks into a series of simpler prompts, allowing the model to tackle each step sequentially.&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://www.promptingguide.ai/techniques/prompt_chaining&quot;&gt;Prompt Chaining&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tree of Thoughts (ToT)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Extending chain-of-thought by exploring multiple reasoning paths in a tree structure to improve problem-solving.&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://www.promptingguide.ai/techniques/tot&quot;&gt;Tree of Thoughts&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Retrieval-Augmented Generation (RAG)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Combining external knowledge retrieval with generation to provide up-to-date and accurate information.&lt;/td&gt;
&lt;td&gt;Retrieval-Augmented Generation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Automatic Prompt Engineer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Utilizing models to automatically generate and optimize prompts, reducing manual effort.&lt;/td&gt;
&lt;td&gt;Automatic Prompt Engineer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Active-Prompt&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Engaging the model in an interactive manner to iteratively refine prompts and improve responses.&lt;/td&gt;
&lt;td&gt;Active-Prompt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Directional Stimulus Prompting&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Guiding the model&apos;s output by providing specific cues or directions within the prompt.&lt;/td&gt;
&lt;td&gt;Directional Stimulus Prompting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Program-Aided Language Models (PAL)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Integrating programming logic with language models to handle tasks requiring precise computations.&lt;/td&gt;
&lt;td&gt;Program-Aided Language Models&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ReAct&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Combining reasoning and acting by prompting the model to perform actions based on its reasoning process.&lt;/td&gt;
&lt;td&gt;ReAct&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reflexion&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Encouraging the model to reflect on its responses and iteratively improve them.&lt;/td&gt;
&lt;td&gt;Reflexion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multimodal Chain-of-Thought (CoT)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Applying chain-of-thought prompting across multiple modalities, such as text and images.&lt;/td&gt;
&lt;td&gt;Multimodal CoT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Graph Prompting&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Utilizing graph structures within prompts to represent relationships and enhance understanding.&lt;/td&gt;
&lt;td&gt;Graph Prompting&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
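&lt;p&gt;To make the first two techniques in the table concrete, here’s a sketch of how zero-shot and few-shot prompts differ structurally. This is plain string assembly; no particular model API is assumed:&lt;/p&gt;

```python
# Zero-shot: just the instruction. Few-shot: prepend worked examples
# so the model can infer the expected format. The examples are made up.
def zero_shot(task):
    return task

def few_shot(task, examples):
    shots = "\n".join(f"Input: {i}\nOutput: {o}" for i, o in examples)
    return f"{shots}\nInput: {task}\nOutput:"

examples = [("I loved this movie!", "positive"),
            ("Terrible service.", "negative")]
print(few_shot("The food was amazing.", examples))
```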
&lt;h3&gt;Bonus: Is prompt engineering unnecessary with powerful AI models?&lt;/h3&gt;
&lt;p&gt;Even with advanced large language models (LLMs), Prompt Engineering remains crucial.&lt;/p&gt;
&lt;p&gt;The quality of prompt design directly impacts the model&apos;s performance on specific tasks. Well-crafted prompts significantly improve output accuracy and relevance. Prompt Engineering can also help:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Guide complex reasoning&lt;/strong&gt;: It enables the model to perform intricate tasks or solve layered problems.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reduce hallucinations&lt;/strong&gt;: Proper prompts minimize the chances of the model generating false or irrelevant information.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Improve domain-specific adaptability&lt;/strong&gt;: Tailored prompts ensure better performance in specialized fields.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For further insights, check out the paper &lt;em&gt;“&lt;a href=&quot;https://arxiv.org/abs/2310.14735&quot;&gt;Unleashing the Potential of Prompt Engineering in Large Language Models: A Comprehensive Review&lt;/a&gt;”&lt;/em&gt; on arXiv.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;Alright, that’s all for today! If you enjoyed this or found it helpful, don’t forget to follow me. Let’s keep growing and learning together!&lt;/p&gt;
&lt;p&gt;Goodnight! Dream big, folks…&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Parameters vs. Inference Speed: Why Is Your Phone’s AI Model ‘Slimmer’ Than GPT-4?</title><link>https://geekcoding101.com/posts/parameters-vs-inference-speed-why-is-your-phones-ai-model-slimmer-than-gpt-4</link><guid isPermaLink="true">https://geekcoding101.com/posts/parameters-vs-inference-speed-why-is-your-phones-ai-model-slimmer-than-gpt-4</guid><pubDate>Sat, 07 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h3&gt;&lt;strong&gt;1. What Are Parameters?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;This was covered in a previous issue: &lt;a href=&quot;/posts/what-are-parameters-why-are-bigger-models-often-smarter&quot;&gt;What Are Parameters? Why Are “Bigger” Models Often “Smarter”?&lt;/a&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;2. The Relationship Between Parameter Count and Inference Speed&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;As the number of parameters in a model increases, it requires more computational resources to perform inference (i.e., generate results). This directly impacts inference speed. However, the relationship between parameters and speed is not a straightforward inverse correlation.&lt;/p&gt;
&lt;p&gt;Several factors influence inference speed:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Computational Load (FLOPs)&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;The number of floating-point operations (FLOPs) required by a model directly impacts inference time. However, FLOPs are not the sole determinant since different types of operations may execute with varying efficiency on hardware.&lt;/p&gt;
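&lt;p&gt;As a back-of-the-envelope illustration (a common rule of thumb, not a benchmark): a dense layer with an m-by-n weight matrix costs about 2*m*n FLOPs per input, and a decoder-style transformer costs roughly 2 FLOPs per parameter per generated token:&lt;/p&gt;

```python
# Back-of-the-envelope FLOPs estimates (approximations, not measurements).
def linear_flops(m, n):
    # one multiply plus one add per weight
    return 2 * m * n

def transformer_flops_per_token(num_params):
    # common rule of thumb for a decoder forward pass:
    # about 2 FLOPs per parameter per generated token
    return 2 * num_params

print(linear_flops(4096, 4096))          # one dense layer
print(transformer_flops_per_token(7e9))  # a hypothetical 7B-parameter model
```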
&lt;h4&gt;&lt;strong&gt;(2) Memory Access Cost&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;During inference, the model frequently accesses memory. The volume of memory access (or memory bandwidth requirements) can affect speed. For instance, both the computational load and memory access demands of deep learning models significantly impact deployment and inference performance.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(3) Model Architecture&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;The design of the model, including its parallelism and branching structure, influences efficiency. For example, branched architectures may introduce synchronization overhead, causing some compute units to idle and slowing inference.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(4) Hardware Architecture&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Different hardware setups handle models differently. A device’s computational power, memory bandwidth, and overall architecture all affect inference speed. Efficient neural network designs must balance computational load and memory demands for optimal performance across various hardware environments.&lt;/p&gt;
&lt;p&gt;Thus, while parameter count is one factor affecting inference time, it’s not a simple inverse relationship. Optimizing inference speed requires consideration of computational load, memory access patterns, model architecture, and hardware capabilities.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;3. Why Are AI Models on Phones ‘Slimmer’ Than GPT-4?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;AI models running on phones are heavily compressed and optimized to operate within the resource constraints of mobile devices. Common optimization techniques include:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Model Quantization&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Quantization reduces the precision of model parameters from high precision (e.g., 32-bit floating-point) to lower precision (e.g., 8-bit integers), thereby reducing memory usage and computational requirements. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A non-quantized model might require 100GB of memory.&lt;/li&gt;
&lt;li&gt;A quantized version could reduce this to 10GB or less.&lt;/li&gt;
&lt;/ul&gt;
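&lt;p&gt;The saving is easy to estimate: a parameter stored as a 32-bit float takes 4 bytes, while an 8-bit integer takes 1 byte. A quick sketch with illustrative numbers:&lt;/p&gt;

```python
# Estimate weight memory at different precisions (weights only;
# activations and the KV cache are ignored for simplicity).
def model_size_gb(num_params, bytes_per_param):
    return num_params * bytes_per_param / 1e9

params = 25e9  # a hypothetical 25B-parameter model
print(f"fp32: {model_size_gb(params, 4):.0f} GB")  # 100 GB
print(f"int8: {model_size_gb(params, 1):.0f} GB")  # 25 GB
# int8 alone gives 4x; 4-bit or mixed-precision schemes compress further.
```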
&lt;h4&gt;&lt;strong&gt;(2) Knowledge Distillation&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;In knowledge distillation, a &quot;large model&quot; teaches a &quot;small model.&quot; The smaller model retains reasonable performance by learning from the large model’s outputs, despite having significantly fewer parameters.&lt;/p&gt;
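&lt;p&gt;One standard formulation, sketched here in pure Python (real training would use a deep-learning framework), trains the student to match the teacher’s softened output distribution:&lt;/p&gt;

```python
import math

# Distillation sketch: the student is penalized by the cross-entropy
# between the teacher's and student's temperature-softened outputs.
# (Real recipes often add a hard-label loss and scale by T**2.)
def softmax(logits, temperature=1.0):
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))

teacher = [3.0, 1.0, 0.2]   # confident teacher
student = [2.5, 1.2, 0.3]   # student roughly agrees, so the loss is small
print(distillation_loss(student, teacher))
```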
&lt;h4&gt;&lt;strong&gt;(3) Model Pruning&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Pruning removes redundant parameters in a model. For instance, neurons with minimal contribution to the output can be “pruned” to reduce the model size without significant performance loss.&lt;/p&gt;
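&lt;p&gt;Magnitude pruning, the simplest variant, zeroes out the weights closest to zero. A toy sketch (real pruning is usually followed by fine-tuning to recover accuracy):&lt;/p&gt;

```python
# Magnitude pruning sketch: keep only the largest-magnitude weights.
# Ties at the threshold may keep slightly more than the target ratio.
def prune(weights, keep_ratio=0.5):
    k = int(len(weights) * keep_ratio)
    ranked = sorted(weights, key=abs, reverse=True)
    threshold = abs(ranked[k - 1]) if k > 0 else float("inf")
    return [w if abs(w) >= threshold else 0.0 for w in weights]

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02]
print(prune(w, keep_ratio=0.5))  # [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]
```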
&lt;h4&gt;&lt;strong&gt;(4) Optimized Inference Frameworks&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Frameworks like TensorFlow Lite and ONNX are specifically designed for mobile and edge devices, offering performance optimizations to enhance inference efficiency.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;4. Real-Life Examples: GPT-4 vs. Mobile AI&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;GPT-4&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;GPT-4 is a massive-scale model designed for cloud-based deployment. It relies on powerful GPU clusters and achieves exceptional performance on complex language tasks. However, this comes with high computational and infrastructure costs.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;Mobile AI&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Take, for instance, the quantized version of &lt;strong&gt;LLaMA 2&lt;/strong&gt;, which has been optimized to run locally on high-end smartphones. While it doesn’t match the raw capabilities of cloud-based large models, it is efficient enough to handle common tasks effectively on-device.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;5. Balancing Parameter Count and Inference Speed&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;The relationship between parameter count and inference speed exemplifies a trade-off:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Large models deliver superior performance but are slower and more expensive to run.&lt;/li&gt;
&lt;li&gt;Smaller models are faster and more resource-efficient but lack the capabilities of larger counterparts.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This trade-off depends on the application context:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Cloud Services&lt;/strong&gt;: Prioritize performance by using large-scale models.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Mobile Devices&lt;/strong&gt;: Focus on speed and energy efficiency with lightweight models.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Edge Computing&lt;/strong&gt;: Strive for a balance between performance and efficiency.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;6. One-Line Summary&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;The parameter count of a model defines its potential capabilities, while inference speed is constrained by computational resources and optimization techniques. Mobile AI models achieve “small but mighty” performance through compression and optimization, but the raw power of GPT-4 and similar models still relies on cloud infrastructure.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Discovering the Joy of Tokens: AI’s Language Magic Unveiled</title><link>https://geekcoding101.com/posts/discovering-the-joy-of-tokens-ais-language-magic-unveiled</link><guid isPermaLink="true">https://geekcoding101.com/posts/discovering-the-joy-of-tokens-ais-language-magic-unveiled</guid><pubDate>Sun, 08 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Today’s topic might seem a bit technical, but don’t worry—we’re keeping it down-to-earth.&lt;/p&gt;
&lt;p&gt;Let’s uncover the secrets of &lt;strong&gt;tokens&lt;/strong&gt;, the building blocks of AI’s understanding of language.&lt;/p&gt;
&lt;p&gt;If you’ve ever used ChatGPT or similar AI tools, you might have noticed something: when you ask a long question, it takes a bit longer to answer. But short questions? Boom, instant response. That’s all thanks to tokens.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;1. What Are Tokens?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;A &lt;strong&gt;token&lt;/strong&gt; is the smallest unit of language that AI models “understand.” It could be a sentence, a word, a single character, or even part of a word. &lt;strong&gt;In short, AI doesn’t understand human language—but it understands tokens.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Take this sentence as an example:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;“AI is incredibly smart.”&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Depending on the tokenization method, this could be broken down into:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Word-level tokens&lt;/strong&gt;: &lt;code&gt;[&quot;AI&quot;, &quot;is&quot;, &quot;incredibly&quot;, &quot;smart&quot;]&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Character-level tokens&lt;/strong&gt;: &lt;code&gt;[&quot;A&quot;, &quot;I&quot;, &quot; &quot;, &quot;i&quot;, &quot;s&quot;, &quot; &quot;, &quot;i&quot;, &quot;n&quot;, &quot;c&quot;, &quot;r&quot;, &quot;e&quot;, &quot;d&quot;, &quot;i&quot;, &quot;b&quot;, &quot;l&quot;, &quot;y&quot;, &quot; &quot;, &quot;s&quot;, &quot;m&quot;, &quot;a&quot;, &quot;r&quot;, &quot;t&quot;]&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Subword-level tokens (the most common method)&lt;/strong&gt;: &lt;code&gt;[&quot;AI&quot;, &quot;is&quot;, &quot;incred&quot;, &quot;ibly&quot;, &quot;smart&quot;]&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
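&lt;p&gt;The three granularities can be mimicked in a few lines of Python. Note that the subword split below is hard-coded purely for illustration; real models learn their subword vocabulary from data:&lt;/p&gt;

```python
sentence = "AI is incredibly smart."

# Word-level: split on whitespace (punctuation handling ignored here)
word_tokens = sentence.rstrip(".").split()

# Character-level: every character, including spaces, is a token
char_tokens = list(sentence)

# Subword-level: hard-coded for illustration; real tokenizers
# (BPE, WordPiece, ...) learn these splits from corpus statistics
subword_tokens = ["AI", "is", "incred", "ibly", "smart"]

print(word_tokens)       # ['AI', 'is', 'incredibly', 'smart']
print(len(char_tokens))  # far more tokens than the word-level split
```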
&lt;p&gt;In a nutshell, AI breaks down sentences into manageable pieces to understand our language. Without tokens, AI is like a brain without neurons—completely clueless.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;2. Why Are Tokens So Important?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;AI models aren’t magical—they rely on a logic of &lt;strong&gt;“predicting the next step.”&lt;/strong&gt; Here’s the simplified workflow: you feed in a token, and the model starts “guessing” what’s next. It’s like texting a friend, saying “I’m feeling,” and your friend immediately replies, “tired.” Is it empathy? Nope—it’s just a logical guess based on past interactions.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;Why Does AI Need Tokens?&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Language is complex, and tokens help AI translate it into something math can handle. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Input&lt;/strong&gt;: “AI is amazing!”&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tokenized version (just an illustrative example)&lt;/strong&gt;: &lt;code&gt;[1234, 5678, 91011]&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Prediction&lt;/strong&gt;: Based on &lt;code&gt;[1234, 5678]&lt;/code&gt;, the model predicts the next token will be &lt;code&gt;91011&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
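&lt;p&gt;That “guess the next token” logic can be sketched with a toy bigram model. Real LLMs use neural networks over huge vocabularies, but the predict-the-next-token framing is the same:&lt;/p&gt;

```python
from collections import Counter, defaultdict

# Toy next-token "model": predict the most frequent follower
# of each token seen in the training text.
def train(tokens):
    followers = defaultdict(Counter)
    for a, b in zip(tokens, tokens[1:]):
        followers[a][b] += 1
    return followers

def predict_next(followers, token):
    return followers[token].most_common(1)[0][0]

text = "AI is amazing and AI is powerful and AI is everywhere".split()
model = train(text)
print(predict_next(model, "AI"))  # 'is'
```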
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;3. How Does AI Tokenize? It’s Not Just Random Chopping&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Tokenization isn’t just smashing sentences with a metaphorical hammer. There’s a method to the madness, and it’s pretty sophisticated:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Word-based Tokenization&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;The simplest method: split the text by spaces. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Input&lt;/strong&gt;: “AI is awesome.”&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tokens&lt;/strong&gt;: &lt;code&gt;[&quot;AI&quot;, &quot;is&quot;, &quot;awesome&quot;]&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pros&lt;/strong&gt;: Fast and straightforward.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cons&lt;/strong&gt;: Fails with punctuation (&lt;code&gt;&quot;awesome!&quot;&lt;/code&gt;) or morphologically complex languages like German.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(2) Subword-based Tokenization (Most Common Approach)&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;This is the go-to method for modern models like GPT or BERT. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Input&lt;/strong&gt;: “awesome.”&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tokens&lt;/strong&gt;: &lt;code&gt;[&quot;awe&quot;, &quot;some&quot;]&lt;/code&gt; Why? It’s great for rare or unknown words. Even if the model hasn’t seen “awesomesauce,” it can still guess its meaning by breaking it into familiar parts like “awe” and “some.”&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(3) Character-based Tokenization&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Every single character is treated as a token:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Input&lt;/strong&gt;: “GPT”&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tokens&lt;/strong&gt;: &lt;code&gt;[&quot;G&quot;, &quot;P&quot;, &quot;T&quot;]&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pros&lt;/strong&gt;: Works for unknown words or typos.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cons&lt;/strong&gt;: Increases the number of tokens drastically, making it computationally expensive.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(4) Byte Pair Encoding (BPE)&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Despite the fancy name, it’s just a frequency-based approach. The most common character pairs are merged into tokens. For example, the word “the” might appear so frequently that it gets treated as a single token.&lt;/p&gt;
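&lt;p&gt;A single BPE merge step fits in a few lines: count adjacent symbol pairs across the corpus and merge the most frequent one. This is a simplified sketch of the idea; real implementations repeat it thousands of times over large corpora:&lt;/p&gt;

```python
from collections import Counter

# One step of byte-pair encoding on a toy "corpus" of symbol lists.
def most_frequent_pair(corpus):
    pairs = Counter()
    for word in corpus:
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += 1
    return pairs.most_common(1)[0][0]

def merge_pair(corpus, pair):
    merged_corpus = []
    for word in corpus:
        out, skip = [], False
        for a, b in zip(word, word[1:] + [None]):
            if skip:
                skip = False
                continue
            if (a, b) == pair:
                out.append(a + b)  # fuse the pair into one symbol
                skip = True
            else:
                out.append(a)
        merged_corpus.append(out)
    return merged_corpus

corpus = [list("the"), list("then"), list("there"), list("thin")]
pair = most_frequent_pair(corpus)
print(pair)                           # ('t', 'h')
print(merge_pair(corpus, pair)[0])    # ['th', 'e']
```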
&lt;p&gt;In short: AI tokenization isn’t random; it’s a carefully designed process balancing precision and efficiency.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;4. The Real Impact of Tokens on AI&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Tokens aren’t just technical jargon—they directly affect how well an AI model performs. Here’s how:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Context Range&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;A model’s token limit determines how much “context” it can remember in one go.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;GPT-3 can handle &lt;strong&gt;4096 tokens&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;GPT-4 extends this to &lt;strong&gt;32,000 tokens&lt;/strong&gt;. &lt;strong&gt;What does this mean?&lt;/strong&gt; With GPT-4, you can feed it a lengthy legal contract, and it can still keep the entire thing in memory while generating output. GPT-3? It’ll probably cut you off halfway, saying, “I forgot what you said earlier.”&lt;/li&gt;
&lt;/ul&gt;
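&lt;p&gt;You can get a feel for whether a document fits in a context window with a rough heuristic: for English text, one token is often around 4 characters. The exact count depends entirely on the model’s tokenizer (a library such as tiktoken gives real counts), so treat this strictly as an estimate:&lt;/p&gt;

```python
# Very rough token estimate: heuristic only, not a real tokenizer.
def estimate_tokens(text):
    return max(1, len(text) // 4)  # ~4 characters per English token

contract = "Whereas the parties agree to the following terms, " * 600
needed = estimate_tokens(contract)
print(needed, "tokens (rough estimate)")
# A contract this long exceeds a 4096-token window:
print("fits in 4096-token window:", 4096 > needed)
```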
&lt;h4&gt;&lt;strong&gt;(2) Generation Quality&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Tokenization affects how smoothly AI generates text. For instance, subword tokenization helps AI recognize that “amazingly” and “amazing” are related, improving its ability to generate coherent content. A less sophisticated tokenizer might not make the connection.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(3) Computational Cost&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Each token adds to the computational workload. This is why AI slows down with longer inputs—more tokens mean more processing, leading to what I like to call “computational fatigue.”&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;5. The Limitations of Tokenization&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;While tokenization is essential, it’s not without its quirks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Semantic Splitting&lt;/strong&gt;: Breaking “unbelievable” into &lt;code&gt;[&quot;un&quot;, &quot;believ&quot;, &quot;able&quot;]&lt;/code&gt; might make sense mathematically but could dilute the semantic meaning.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Language Diversity&lt;/strong&gt;: Tokenization rules vary widely across languages. What works for English may fail spectacularly for Chinese or Arabic.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Resource Consumption&lt;/strong&gt;: Tokenizing long texts adds overhead, slowing down inference times and increasing computational demand.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src=&quot;./llm-token-limit.png&quot; alt=&quot;llm token limit&quot; title=&quot;llm token limit&quot; /&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;6. One-Line Summary&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Tokens are the building blocks of AI’s language understanding, and tokenization is the bridge that translates human language into math. Without tokens, AI is just a heap of clueless parameters.&lt;/p&gt;
&lt;p&gt;More information: &lt;a href=&quot;https://thenewstack.io/the-building-blocks-of-llms-vectors-tokens-and-embeddings/&quot;&gt;The Building Blocks of LLMs: Vectors, Tokens and Embeddings&lt;/a&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Final Thoughts&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;AI may seem like “magic” but it’s really all about the details. Next time you’re using ChatGPT, try guessing: how many tokens did my question use? Did it exceed the context window? These “hidden mechanics” play a big role in determining how accurate and useful the AI’s response will be.&lt;/p&gt;
&lt;p&gt;Alright, that’s it for today’s AI dissection! Follow me for more bite-sized insights, and let’s keep uncovering the nuts and bolts of AI together! See you tomorrow.&lt;/p&gt;
&lt;p&gt;P.S. Feel free to check out my other posts about &lt;a href=&quot;https://geekcoding101.com/tags/daily-ai-insights&quot;&gt;Daily AI Insights&lt;/a&gt;.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Fine-Tuning Models: Unlocking the Extraordinary Potential of AI</title><link>https://geekcoding101.com/posts/fine-tuning-models-unlocking-the-extraordinary-potential-of-ai</link><guid isPermaLink="true">https://geekcoding101.com/posts/fine-tuning-models-unlocking-the-extraordinary-potential-of-ai</guid><pubDate>Mon, 09 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h3&gt;&lt;strong&gt;1. What Is Fine-Tuning?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;img src=&quot;./fine-tuning.png&quot; alt=&quot;fine-tuning&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Fine-tuning is a key process in AI training, where a pre-trained model is further trained on specific data to specialize in a particular task or domain.&lt;/p&gt;
&lt;p&gt;Think of it this way: It is like giving a generalist expert additional training to become a specialist. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Pre-trained model&lt;/strong&gt;: Knows general knowledge (like basic reading comprehension or common language patterns).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fine-tuned model&lt;/strong&gt;: Gains expertise in a specific field, such as medical diagnostics, legal analysis, or poetry writing.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;2. Why Is Fine-Tuning Necessary?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Pre-trained models like GPT-4 and BERT are powerful, but they’re built for general-purpose use. Fine-tuning tailors these models for specialized applications. Here’s why it’s important:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Adapting to Specific Scenarios&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;General-purpose models are like encyclopedias—broad but not deep. Fine-tuning narrows their focus to master specific contexts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Medical AI&lt;/strong&gt;: Understands specialized terms like &quot;coronary artery disease.&quot;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Legal AI&lt;/strong&gt;: Deciphers complex legal jargon and formats.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(2) Saving Computational Resources&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Training a model from scratch requires enormous resources. Fine-tuning leverages existing pre-trained knowledge, making the process faster and more cost-effective.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(3) Improving Performance&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;By focusing on domain-specific data, fine-tuned models outperform general models in specialized tasks. They can understand unique patterns and nuances within the target domain.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;3. How Does It Work?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;It typically involves the following steps:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Selecting a Pre-trained Model&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Choose a pre-trained model, such as GPT, BERT, or similar. These models have already been trained on massive datasets and understand the general structure of language.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(2) Preparing a Specialized Dataset&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Gather a high-quality dataset relevant to your specific task. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;For legal document generation: A dataset of contracts and case law.&lt;/li&gt;
&lt;li&gt;For medical diagnosis: A dataset of clinical notes and research papers.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(3) Training the Model&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Train the pre-trained model on your domain-specific dataset, fine-tuning its parameters to optimize performance for your task. This process usually requires only a few training epochs.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(4) Validation and Adjustment&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Test the fine-tuned model on unseen data to evaluate its performance. If necessary, refine the dataset or training process to achieve better results.&lt;/p&gt;
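&lt;p&gt;The four steps above can be sketched in miniature. The snippet below is a deliberately tiny stand-in (a one-parameter linear model in plain Python, not a real LLM): we start from &quot;pretrained&quot; weights, run a few epochs of gradient descent on task data, and validate on a held-out point.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def mse(w, data):
    # Mean squared error of the one-parameter model y = w * x.
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def fine_tune(w, task_data, lr=0.01, epochs=20):
    # Step 3: a few epochs of gradient descent on the task dataset.
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in task_data) / len(task_data)
        w -= lr * grad
    return w

w_pretrained = 1.0                          # step 1: the &quot;pretrained&quot; model
task_data = [(1, 2.1), (2, 3.9), (3, 6.2)]  # step 2: specialized data (y is roughly 2x)
held_out = [(4, 8.0)]                       # step 4: unseen validation data

w = fine_tune(w_pretrained, task_data)
print(mse(w_pretrained, held_out), &quot;-&gt;&quot;, mse(w, held_out))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The held-out error drops after fine-tuning; if it had stayed high, step 4 would send us back to refine the dataset or the training setup.&lt;/p&gt;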
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;4. Real-Life Applications&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Fine-tuning has revolutionized numerous fields. Here are some examples:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Medicine&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Goal&lt;/strong&gt;: Develop AI models capable of interpreting medical images or summarizing clinical reports.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dataset&lt;/strong&gt;: Medical records, radiology images, and research articles.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Outcome&lt;/strong&gt;: A model that understands medical terminology and improves diagnostic accuracy.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(2) Legal Industry&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Goal&lt;/strong&gt;: Automate the generation of legal documents or analyze case law.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dataset&lt;/strong&gt;: Legal texts, contracts, and court rulings.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Outcome&lt;/strong&gt;: An AI that produces professional, compliant legal outputs.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(3) Financial Markets&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Goal&lt;/strong&gt;: Enable AI to analyze financial reports or make investment recommendations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dataset&lt;/strong&gt;: Historical stock data and financial statements.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Outcome&lt;/strong&gt;: A system that provides insights tailored to financial decision-making.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;5. Challenges&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;While fine-tuning is a powerful technique, it’s not without limitations:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Overfitting&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;If the dataset is too small or overly specific, the model may overfit, memorizing data patterns instead of generalizing knowledge.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(2) Cost Dependencies&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Fine-tuning is more efficient than training from scratch but still requires computational resources and time—especially for large models.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(3) Data Bias&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;If the fine-tuning dataset contains biases, the model can inherit or amplify those biases.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;6. One-Line Summary&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Fine-tuning customizes pre-trained AI models for specific tasks, making them specialists in chosen domains, provided high-quality data and robust training are applied.&lt;/p&gt;
&lt;p&gt;You can find some more details in the great paper &quot;&lt;a href=&quot;https://arxiv.org/html/2408.13296v1&quot;&gt;The Ultimate Guide to Fine-Tuning LLMs from Basics to Breakthroughs&lt;/a&gt;&quot;.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Final Thoughts&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Want an AI that writes professional contracts, generates medical reports, or offers personalized insights? Fine-tuning is how you teach a generalist AI to become an expert. But remember, it’s only as good as the data and training it receives.&lt;/p&gt;
&lt;p&gt;That’s it for today’s AI deep dive! Follow for more, and let’s keep exploring the endless possibilities of AI together. See you tomorrow!&lt;/p&gt;
&lt;p&gt;P.S. Feel free to check out my other &lt;a href=&quot;https://geekcoding101.com/tags/daily-ai-insights&quot;&gt;Daily AI Insights posts&lt;/a&gt;.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>What Is an Embedding? The Bridge From Text to the World of Numbers</title><link>https://geekcoding101.com/posts/what-is-an-embedding-the-bridge-from-text-to-the-world-of-numbers</link><guid isPermaLink="true">https://geekcoding101.com/posts/what-is-an-embedding-the-bridge-from-text-to-the-world-of-numbers</guid><pubDate>Mon, 09 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h3&gt;&lt;strong&gt;1. What Is an Embedding?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;An &lt;strong&gt;embedding&lt;/strong&gt; is the “translator” that converts language into numbers, enabling AI models to understand and process human language. AI doesn’t comprehend words, sentences, or syntax—it only works with numbers. Embeddings assign a unique numerical representation (a vector) to words, phrases, or sentences.&lt;/p&gt;
&lt;p&gt;Think of an embedding as a &lt;strong&gt;language map&lt;/strong&gt;: each word is a point on the map, and its position reflects its relationship with other words. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;“cat” and “dog” might be close together on the map, while “cat” and “car” are far apart.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;2. Why Do We Need Embeddings?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Human language is rich and abstract, but AI models need to translate it into something mathematical to work with. Embeddings solve several key challenges:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Vectorizing Language&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Words are converted into vectors (lists of numbers). For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;“cat” → &lt;code&gt;[0.1, 0.3, 0.5]&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;“dog” → &lt;code&gt;[0.1, 0.32, 0.51]&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These vectors make it possible for models to perform mathematical operations like comparing, clustering, or predicting relationships.&lt;/p&gt;
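&lt;p&gt;For example, the toy vectors above can be compared with cosine similarity (the dot product of the normalized vectors). The &quot;car&quot; vector below is made up purely for contrast:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import math

def cosine(u, v):
    # Dot product divided by the product of the vectors&apos; lengths.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

cat = [0.1, 0.3, 0.5]
dog = [0.1, 0.32, 0.51]   # nearly parallel to &quot;cat&quot;
car = [0.9, 0.1, 0.05]    # made up to point in a different direction

print(cosine(cat, dog))   # close to 1.0: semantically near
print(cosine(cat, car))   # much smaller: semantically far
&lt;/code&gt;&lt;/pre&gt;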
&lt;h4&gt;&lt;strong&gt;(2) Capturing Semantic Relationships&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;The true power of embeddings lies in capturing &lt;strong&gt;semantic relationships&lt;/strong&gt; between words. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;“king - man + woman ≈ queen” This demonstrates how embeddings encode complex relationships in a numerical format.&lt;/li&gt;
&lt;/ul&gt;
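&lt;p&gt;The famous analogy can be reproduced with made-up 2-D vectors whose dimensions we pretend mean [royalty, femaleness]. Real embeddings have hundreds of opaque dimensions, but the arithmetic is the same:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Hypothetical 2-D embeddings: dimensions [royalty, femaleness].
vectors = {
    &quot;man&quot;:   [0.0, 0.0],
    &quot;woman&quot;: [0.0, 1.0],
    &quot;king&quot;:  [1.0, 0.0],
    &quot;queen&quot;: [1.0, 1.0],
}

# king - man + woman, computed component-wise
result = [k - m + w for k, m, w in
          zip(vectors[&quot;king&quot;], vectors[&quot;man&quot;], vectors[&quot;woman&quot;])]

# The nearest stored vector to the result is &quot;queen&quot;.
nearest = min(vectors, key=lambda word: sum(
    (a - b) ** 2 for a, b in zip(vectors[word], result)))
print(result, nearest)  # [1.0, 1.0] queen
&lt;/code&gt;&lt;/pre&gt;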
&lt;h4&gt;&lt;strong&gt;(3) Addressing Data Sparsity&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Instead of assigning a unique index to every word (which can lead to sparse data), embeddings compress language into a limited number of dimensions (e.g., 100 or 300), making computations much more efficient.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;3. How Are Embeddings Created?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Embeddings are generated through machine learning models trained on large datasets. Here are some popular methods:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Word2Vec&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;One of the earliest and most successful embedding methods, Word2Vec is based on the idea that &lt;strong&gt;similar words appear in similar contexts&lt;/strong&gt;. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Sentences: “Cats love milk” and “Dogs love bones”&lt;/li&gt;
&lt;li&gt;Word2Vec places “cats” and “dogs” close together because they share similar linguistic surroundings.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(2) GloVe&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;GloVe (Global Vectors for Word Representation) focuses on capturing statistical co-occurrence. For instance:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Words like “apple” and “orange” often co-occur with “fruit,” and this relationship is encoded in their embeddings.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(3) Transformer Models (e.g., BERT, GPT)&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Modern models dynamically create embeddings based on context. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The word “bank” in “river bank” and “money bank” will have different embeddings, allowing the model to disambiguate meanings.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;4. Applications of Embeddings&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Embeddings are foundational to many AI applications, including:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Search Engines&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;By converting queries and documents into embeddings, search engines calculate their similarity (e.g., using dot products) to deliver the most relevant results.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(2) Recommendation Systems&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Platforms like YouTube and Netflix use embeddings to represent user preferences and content. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Movies are embedded as vectors, and the system recommends content based on vector similarity.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(3) Generative AI&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Embeddings enable models like ChatGPT or DALL-E to process and generate coherent text, images, and more.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;5. How Dot Products Relate to Embeddings&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Embeddings frequently involve &lt;strong&gt;dot product calculations&lt;/strong&gt;, a crucial mathematical operation for comparing vectors. Here’s where dot products come into play:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Similarity Measurement&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;In recommendation systems or search engines, the dot product measures the similarity between two vectors:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If the dot product is high, the items (e.g., a query and a document) are similar.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(2) Attention Mechanism&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;In Transformer models, dot products are used to compute attention scores, determining which parts of an input sequence are most relevant to the task.&lt;/p&gt;
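&lt;p&gt;Here is a minimal sketch of scaled dot-product attention for a single query, in plain Python. Real implementations are batched matrix operations, but the arithmetic is the same: dot products score each key against the query, softmax turns the scores into weights, and the weights mix the values.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import math

def softmax(xs):
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    # Dot each key with the query, scale by sqrt(d), softmax into
    # weights, then take the weighted average of the values.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

query  = [1.0, 0.0]
keys   = [[1.0, 0.0], [0.0, 1.0]]       # the first key matches the query best
values = [[10.0, 0.0], [0.0, 10.0]]
print(attention(query, keys, values))   # output leans toward the first value
&lt;/code&gt;&lt;/pre&gt;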
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;6. Challenges of Embeddings&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Despite their power, embeddings face some limitations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Data Dependency&lt;/strong&gt;: Embedding quality depends heavily on training data. Biased data can result in biased embeddings.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dimensional Trade-Offs&lt;/strong&gt;: High-dimensional embeddings are computationally expensive, while low-dimensional ones may lose critical information.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Semantic Ambiguity&lt;/strong&gt;: Even advanced embeddings struggle with capturing nuanced or metaphorical meanings.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;7. Visualization Resources&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;To better understand embeddings, here are some types of visualizations you can explore online:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;Embedding Space&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Search&lt;/strong&gt;: &lt;code&gt;embedding space visualization&lt;/code&gt; or &lt;code&gt;word embedding map&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;These diagrams illustrate how words are distributed in a 2D or 3D space, showing semantic relationships.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;Dot Product Similarity&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Search&lt;/strong&gt;: &lt;code&gt;dot product similarity visualization&lt;/code&gt; or &lt;code&gt;cosine similarity embedding&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Demonstrates how embeddings are compared mathematically.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;Attention Mechanisms&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Search&lt;/strong&gt;: &lt;code&gt;transformer attention scores&lt;/code&gt; or &lt;code&gt;attention mechanism visualization&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Explains how embeddings and dot products work together in Transformers.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;8. One-Line Summary&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Embeddings are the bridge between human language and machine understanding, enabling AI models to map linguistic relationships into a mathematical space.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Final Thoughts&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Actually, I feel like I didn&apos;t delve as deeply into embeddings this time. There&apos;s just so much math involved, especially the dot product calculation of vectors. For those who want to learn more, I recommend the article &lt;a href=&quot;https://ai.gopubby.com/an-intuitive-101-guide-to-vector-embeddings-ffde295c3558&quot;&gt;An Intuitive 101 Guide to Vector Embeddings&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Embeddings might seem like a dry technical concept, but they’re the unsung heroes behind AI’s ability to generate text, recommend content, and more. Next time you use ChatGPT, think about how every word you type has been transformed into a dense vector representation. Behind the magic is a lot of math!&lt;/p&gt;
&lt;p&gt;Let’s keep breaking down AI one piece at a time—follow for more insights, and see you tomorrow!&lt;/p&gt;
&lt;p&gt;Wow! Today marks the seventh issue of my &quot;&lt;a href=&quot;https://geekcoding101.com/tags/daily-ai-insights&quot;&gt;Daily AI Insights Series&lt;/a&gt;&quot;—a full week of consistent daily posts! Through this journey, I&apos;ve received so much support and grown a lot.&lt;/p&gt;
&lt;p&gt;Thank you all for your encouragement!&lt;/p&gt;
&lt;p&gt;Let’s keep it up! It’s Sunday today, and I originally thought about skipping it... but let’s push forward! Keep going, keep going!&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Pretraining vs. Fine-Tuning: What&apos;s the Difference?</title><link>https://geekcoding101.com/posts/pretraining-vs-fine-tuning-whats-the-difference</link><guid isPermaLink="true">https://geekcoding101.com/posts/pretraining-vs-fine-tuning-whats-the-difference</guid><pubDate>Wed, 11 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Let&apos;s deep dive into &lt;a href=&quot;https://www.reddit.com/r/learnmachinelearning/comments/19f04y3/what_is_the_difference_between_pretraining/&quot;&gt;pretraining and fine-tuning&lt;/a&gt; today!&lt;/p&gt;
&lt;h3&gt;&lt;strong&gt;1. What Is Pretraining?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;img src=&quot;./pretraining.webp&quot; alt=&quot;pretraining&quot; title=&quot;pretraining&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pretraining&lt;/strong&gt; is the first step in building AI models. Its goal is to equip the model with general language knowledge. Think of pretraining as “elementary school” for AI, where it learns how to read, understand, and process language using &lt;strong&gt;large-scale general datasets&lt;/strong&gt; (like Wikipedia, books, and news articles). During this phase, the model learns sentence structure, grammar rules, common word relationships, and more.&lt;/p&gt;
&lt;p&gt;For example, pretraining tasks might include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Masked Language Modeling (MLM):&lt;/strong&gt; Input: “John loves ___ and basketball.” The model predicts: “football.”&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Causal Language Modeling (CLM):&lt;/strong&gt; Input: “The weather is great, I want to go to” The model predicts: “the park.”&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Through this process, the model develops a foundational understanding of language.&lt;/p&gt;
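&lt;p&gt;The causal (next-word) objective can be shown in miniature with bigram counts over a tiny corpus. This is nothing like a real pretraining run, but it has the same shape: predict what comes next from what came before.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from collections import Counter, defaultdict

corpus = &quot;the weather is great i want to go to the park&quot;.split()

# Count how often each word follows another (bigram statistics).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    # CLM in miniature: emit the most frequent continuation.
    return following[word].most_common(1)[0][0]

print(predict_next(&quot;go&quot;))   # &apos;to&apos;
&lt;/code&gt;&lt;/pre&gt;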
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;2. What Is Fine-Tuning?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Fine-tuning&lt;/strong&gt; builds on top of a pretrained model by training it on task-specific data to specialize in a particular area. Think of it as “college” for AI—it narrows the focus and develops expertise in specific domains. It uses &lt;strong&gt;smaller, targeted datasets&lt;/strong&gt; to optimize the model for specialized tasks (e.g., sentiment analysis, medical diagnosis, or legal document drafting).&lt;/p&gt;
&lt;p&gt;For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;To fine-tune a model for legal document generation, you would train it on a dataset of contracts and legal texts.&lt;/li&gt;
&lt;li&gt;To fine-tune a model for customer service, you would use your company’s FAQ logs.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Fine-tuning enables AI to excel at specific tasks without needing to start from scratch.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;3. Key Differences Between Pretraining and Fine-Tuning&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;While both processes aim to improve AI’s capabilities, they differ fundamentally in purpose and execution:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Aspect&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Pretraining&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Fine-Tuning&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Purpose&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;To learn general language knowledge, including vocabulary, syntax, and semantic relationships.&lt;/td&gt;
&lt;td&gt;To adapt the model for specific tasks or domains.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Source&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Large-scale, general datasets (e.g., Wikipedia, books, news).&lt;/td&gt;
&lt;td&gt;Domain-specific, smaller datasets (e.g., medical records, legal texts, customer FAQs).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Time and Cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Time-consuming and computationally expensive, requiring extensive GPU/TPU resources.&lt;/td&gt;
&lt;td&gt;Quicker and less resource-intensive.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Use Case&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Provides foundational language capabilities for a wide range of applications.&lt;/td&gt;
&lt;td&gt;Enables custom applications, like translation, sentiment analysis, or text summarization.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Parameter Update&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Learns all model parameters from scratch.&lt;/td&gt;
&lt;td&gt;Makes targeted adjustments to the pretrained model’s parameters.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;In short:&lt;/strong&gt; Pretraining builds the “brain” of AI, while fine-tuning teaches it specific “skills.”&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;4. Why Separate Pretraining and Fine-Tuning?&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;(1) Efficiency&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Pretraining requires vast amounts of data and computational power. For instance, GPT-3’s pretraining cost millions of dollars in GPU time. Fine-tuning, however, can achieve impressive results with a smaller dataset and less computational effort.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(2) General vs. Specific Knowledge&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Pretrained models are designed for general-purpose tasks, while fine-tuning tailors these models for specific use cases, expanding their utility.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(3) Reusability&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;A single pretrained model can be fine-tuned for various domains, such as legal, medical, or educational applications. This modularity reduces redundancy and speeds up AI development.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;5. Real-Life Applications&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;Case 1: Chatbots&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Pretrained model:&lt;/strong&gt; Understands general conversational language, like greetings or small talk.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fine-tuned model:&lt;/strong&gt; Learns how to answer domain-specific questions, such as those related to product returns.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;Case 2: Legal Document Generation&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Pretrained model:&lt;/strong&gt; Recognizes general language patterns and logical structures.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fine-tuned model:&lt;/strong&gt; Can generate contracts and ensure legal compliance using domain-specific datasets.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;Case 3: Medical Diagnosis&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Pretrained model:&lt;/strong&gt; Understands basic language and context relationships.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fine-tuned model:&lt;/strong&gt; Analyzes medical records and generates insights specific to healthcare.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;6. The Future&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;As AI models grow in size and capability (e.g., GPT-4 and beyond), techniques like &lt;strong&gt;Few-Shot Learning&lt;/strong&gt; and &lt;strong&gt;Zero-Shot Learning&lt;/strong&gt; are reducing dependence on fine-tuning for some tasks. However, for highly specialized use cases, fine-tuning remains indispensable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Trends to watch:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Stronger pretrained models:&lt;/strong&gt; Increasingly capable of handling a broader range of tasks out of the box.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Simplified fine-tuning tools:&lt;/strong&gt; Making it easier for businesses and individuals to customize AI models.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;7. One-Line Summary&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Pretraining is the “basic education” that equips AI with foundational knowledge, while fine-tuning is the “advanced training” that makes it an expert in specific domains. Together, they are the backbone of modern AI capabilities.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Final Thoughts&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Next time you interact with AI, remember the two phases behind its smarts: Pretraining to make it a “generalist” and fine-tuning to transform it into a “specialist.” Stay tuned for more AI insights tomorrow—follow me to explore the magic of AI!&lt;/p&gt;
&lt;p&gt;Finally, feel free to check out my other &lt;a href=&quot;https://geekcoding101.com/tags/daily-ai-insights&quot;&gt;Daily AI Insights posts&lt;/a&gt;.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Empower Your AI Journey: Foundation Models Explained</title><link>https://geekcoding101.com/posts/empower-your-ai-journey-foundation-models-explained</link><guid isPermaLink="true">https://geekcoding101.com/posts/empower-your-ai-journey-foundation-models-explained</guid><pubDate>Thu, 12 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;h3&gt;&lt;strong&gt;Introduction: Why It Matters&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;In the rapidly evolving field of AI, the distinction between &lt;strong&gt;foundation models&lt;/strong&gt; and &lt;strong&gt;task models&lt;/strong&gt; is critical for understanding how modern AI systems work. Foundation models, like GPT-4 or BERT, provide the backbone of AI development, offering general-purpose capabilities. Task models, on the other hand, are fine-tuned or custom-built for specific applications. Understanding their differences helps businesses and developers leverage the right model for the right task, optimizing both performance and cost. Let’s dive into how these two types of models differ and why both are essential.&lt;/p&gt;
&lt;p&gt;Today&apos;s topic is similar to &lt;a href=&quot;/posts/pretraining-vs-fine-tuning-whats-the-difference&quot;&gt;Pretraining vs. Fine-Tuning&lt;/a&gt;. While &lt;strong&gt;&quot;Foundation Models vs. Task Models&quot;&lt;/strong&gt; and &lt;strong&gt;&quot;Pretraining vs. Fine-Tuning&quot;&lt;/strong&gt; are closely related, they’re not exactly the same. &lt;strong&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Foundation_model&quot;&gt;Foundation Models&lt;/a&gt; and Pretraining&lt;/strong&gt;: Foundation models are &lt;strong&gt;products of pretraining&lt;/strong&gt;. Task models are often derived from foundation models through &lt;strong&gt;fine-tuning&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;I cover them separately because people sometimes confuse the two, and treating them on their own gives each a clearer focus.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;1. What Are Foundation Models?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;img src=&quot;./Foundation_models.png&quot; alt=&quot;Foundation models&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Foundation models are &lt;strong&gt;general-purpose AI models&lt;/strong&gt; trained on vast amounts of data to understand and generate language across a wide range of contexts. Their primary goal is to act as a &lt;strong&gt;universal knowledge base&lt;/strong&gt;, capable of supporting a multitude of applications with minimal additional training.&lt;/p&gt;
&lt;p&gt;Examples of foundation models include GPT-4, BERT, and PaLM. These models are not designed for any one task but are built to be flexible, with a deep understanding of grammar, structure, and semantics.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;Key Features:&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Massive Scale&lt;/strong&gt;: Often involve billions or even trillions of parameters (What does parameters mean? You can refer to my previous blog &lt;a href=&quot;/posts/what-are-parameters-why-are-bigger-models-often-smarter&quot;&gt;What Are Parameters?&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Multi-Purpose&lt;/strong&gt;: Can be adapted for numerous tasks through fine-tuning or prompt engineering (Please refer to my previous blog &lt;a href=&quot;/posts/what-is-prompt-engineering-and-how-to-train-ai-with-a-single-sentence&quot;&gt;What Is Prompt Engineering&lt;/a&gt; and &lt;a href=&quot;/daily-ai-insights/what-is-fine-tuning-how-to-teach-ai-specific-skills/&quot;&gt;What Is Fine-Tuning&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pretraining-Driven&lt;/strong&gt;: Trained on vast datasets (e.g., Wikipedia, news, books) to understand general language structures.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Think of a foundation model as a &lt;strong&gt;jack-of-all-trades&lt;/strong&gt;—broadly knowledgeable but not specialized in any one field.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;2. What Are Task Models?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Task models are &lt;strong&gt;specialized AI models&lt;/strong&gt; designed or fine-tuned to excel at a specific task, such as sentiment analysis, machine translation, or medical diagnostics. Unlike foundation models, task models are focused and purpose-built to meet particular goals.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;Key Features:&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Task-Specific&lt;/strong&gt;: Optimized for a narrow set of objectives.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Domain-Specific Data&lt;/strong&gt;: Trained on datasets tailored to the task, such as legal contracts, medical records, or customer reviews.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Lightweight and Deployable&lt;/strong&gt;: Typically smaller and easier to deploy in production settings.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For instance:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A sentiment analysis task model would determine whether a tweet is positive or negative.&lt;/li&gt;
&lt;li&gt;A medical diagnosis task model would analyze patient data and suggest potential conditions.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Task models are like &lt;strong&gt;specialists&lt;/strong&gt; in a particular domain—less versatile than foundation models but highly effective in their area of expertise.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;3. Core Differences Between Foundation Models and Task Models&lt;/strong&gt;&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Aspect&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Foundation Models&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Task Models&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Purpose&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;General-purpose, suitable for multiple applications.&lt;/td&gt;
&lt;td&gt;Focused on a specific task or domain.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Source&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Large-scale, general datasets (e.g., Wikipedia, news).&lt;/td&gt;
&lt;td&gt;Domain-specific datasets (e.g., legal texts, reviews).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Training Process&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pretraining, requiring immense computational resources.&lt;/td&gt;
&lt;td&gt;Fine-tuning or custom training, requiring less computation.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scale&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Billions or trillions of parameters.&lt;/td&gt;
&lt;td&gt;Smaller, optimized for production environments.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Flexibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Highly flexible, can adapt to various tasks.&lt;/td&gt;
&lt;td&gt;Limited to specific tasks, but highly accurate.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;In summary&lt;/strong&gt;: Foundation models are the &lt;strong&gt;base layer&lt;/strong&gt; of AI, while task models are tailored for specific applications.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;4. Why Do We Need Both?&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;(1) Foundation Models: Broad Utility&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Foundation models provide a starting point for diverse applications, saving time and resources. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Use GPT-4 for general-purpose language understanding.&lt;/li&gt;
&lt;li&gt;Use BERT for natural language processing tasks like question answering or summarization.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(2) Task Models: Precision and Efficiency&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Task models optimize performance for specific objectives. They are essential when accuracy and domain knowledge are critical. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Fine-tune a foundation model to generate legally compliant contracts.&lt;/li&gt;
&lt;li&gt;Train a model specifically for medical imaging analysis.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;By combining foundation models with task models, developers can achieve both adaptability and precision.&lt;/p&gt;
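&lt;p&gt;As a purely illustrative toy (none of these classes are a real library API), the division of labor can be sketched as a shared, general-purpose encoder with a thin task-specific head on top:&lt;/p&gt;

```python
class FoundationModel:
    """Stand-in for a large pretrained encoder: maps text to generic features."""
    POSITIVE_CUES = {"great", "good", "love", "excellent"}
    NEGATIVE_CUES = {"bad", "terrible", "hate", "poor"}

    def encode(self, text):
        words = text.lower().split()
        pos = sum(1 for w in words if w in self.POSITIVE_CUES)
        neg = sum(1 for w in words if w in self.NEGATIVE_CUES)
        return {"positive": pos, "negative": neg}


class SentimentTaskModel:
    """Thin task-specific head layered on top of the shared foundation encoder."""
    def __init__(self, foundation):
        self.foundation = foundation

    def predict(self, text):
        scores = self.foundation.encode(text)
        # ties default to "positive" because max keeps the first of equal keys
        return max(["positive", "negative"], key=lambda label: scores[label])


clf = SentimentTaskModel(FoundationModel())
print(clf.predict("I love this great phone"))        # positive
print(clf.predict("terrible battery and bad apps"))  # negative
```

&lt;p&gt;The point of the sketch: the heavy, reusable part lives in one place, while each task only adds a small, cheap layer on top of it.&lt;/p&gt;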
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;5. Real-Life Examples: Foundation Models + Task Models in Action&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;Example 1: Healthcare AI&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Foundation Model&lt;/strong&gt;: GPT-4 understands medical terminology.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Task Model&lt;/strong&gt;: Fine-tuned on clinical records to generate accurate diagnostic reports.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;Example 2: E-commerce Recommendations&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Foundation Model&lt;/strong&gt;: Analyzes general customer sentiment across reviews.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Task Model&lt;/strong&gt;: Customized to recommend products based on specific purchase behaviors.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;Example 3: Legal Document Automation&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Foundation Model&lt;/strong&gt;: Provides general language comprehension.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Task Model&lt;/strong&gt;: Generates legally compliant contracts with domain-specific training.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;6. The Future of Foundation and Task Models&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;As AI continues to evolve, the line between foundation models and task models may blur:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Foundation Models Will Become Stronger&lt;/strong&gt;: With advancements in pretraining, these models might handle specific tasks with little or no fine-tuning (e.g., few-shot learning or zero-shot learning).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Task Models Will Remain Relevant&lt;/strong&gt;: Despite stronger foundation models, specialized tasks requiring domain expertise and precision will still benefit from task-specific training.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The synergy between the two ensures that AI can adapt to both general and niche challenges.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;7. One-Line Summary&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Foundation models provide the broad, flexible foundation for AI, while task models deliver focused, specialized solutions tailored to specific needs.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Final Thoughts&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Understanding the difference between foundation and task models is key to leveraging AI effectively. Whether building a general-purpose tool or solving a domain-specific problem, knowing when to rely on a foundation model and when to train a task model is critical. Stay tuned for more insights tomorrow—follow me for daily AI explorations!&lt;/p&gt;
&lt;p&gt;Finally, feel free to check out my other &lt;a href=&quot;https://geekcoding101.com/tags/daily-ai-insights&quot;&gt;AI Insights blog posts&lt;/a&gt;.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Discover the Power of Zero-Shot and Few-Shot Learning</title><link>https://geekcoding101.com/posts/discover-the-power-of-zero-shot-and-few-shot-learning</link><guid isPermaLink="true">https://geekcoding101.com/posts/discover-the-power-of-zero-shot-and-few-shot-learning</guid><pubDate>Fri, 13 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Transfer learning has revolutionized the way AI models adapt to new tasks, enabling them to generalize knowledge across domains. At its core, transfer learning allows models trained on vast datasets to tackle entirely new challenges with minimal additional data or effort. Two groundbreaking techniques within this framework are &lt;a href=&quot;https://en.wikipedia.org/wiki/Zero-shot_learning&quot;&gt;&lt;strong&gt;Zero-Shot Learning (ZSL)&lt;/strong&gt;&lt;/a&gt; and &lt;a href=&quot;https://www.ibm.com/think/topics/few-shot-learning#:~:text=Few%2Dshot%20learning%20is%20a,suitable%20training%20data%20is%20scarce.&quot;&gt;&lt;strong&gt;Few-Shot Learning (FSL)&lt;/strong&gt;&lt;/a&gt;. ZSL empowers AI to perform tasks without ever seeing labeled examples, while FSL leverages just a handful of examples to quickly master new objectives. These approaches highlight the versatility and efficiency of transfer learning, making it a cornerstone of modern AI applications. Let’s dive deeper into how ZSL and FSL work and why they’re transforming the landscape of machine learning.&lt;/p&gt;
&lt;h3&gt;&lt;strong&gt;1. What Is Zero-Shot Learning (ZSL)?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;img src=&quot;./zero-shot-learning.webp&quot; alt=&quot;zero-shot learning&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Zero-Shot Learning&lt;/strong&gt; refers to an AI model&apos;s ability to perform a specific task without having seen any labeled examples for that task during training. In other words, the model relies on its general knowledge and contextual understanding rather than on task-specific training data.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;Simple Example&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Imagine a model trained to recognize “cats” and “dogs,” but it has never seen a “tiger.” When you show it a tiger and ask, “Is this a tiger?” it can infer that it’s likely a tiger by reasoning based on the similarities and differences between cats, dogs, and tigers.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;How It Works&lt;/strong&gt;&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Semantic Embeddings&lt;/strong&gt; ZSL maps both task descriptions and data samples into a shared semantic space. For instance, the word “tiger” is embedded as a vector, and the model compares it with the image’s vector to infer their relationship.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pretrained Models&lt;/strong&gt; ZSL relies heavily on large foundation models like GPT-4 or CLIP, which have learned extensive general knowledge during pretraining. These models can interpret natural language prompts and infer the answer.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Natural Language Descriptions&lt;/strong&gt; Clear, descriptive prompts like “Is this a tiger?” help the model understand the task through language, allowing it to respond appropriately without requiring task-specific examples.&lt;/li&gt;
&lt;/ol&gt;
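&lt;p&gt;The shared-semantic-space idea in step 1 can be sketched in a few lines: if labels and inputs live in the same vector space, classifying an unseen class reduces to picking the closest label vector. The 3-d vectors below are hand-made toys, not real embeddings:&lt;/p&gt;

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors in the shared semantic space."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy label embeddings; "tiger" was never seen as a training image,
# but its word vector still sits near cat-like, striped things.
label_vecs = {"cat": (0.9, 0.1, 0.0), "dog": (0.1, 0.9, 0.0), "tiger": (0.8, 0.1, 0.6)}
image_vec = (0.85, 0.15, 0.55)  # embedding of a tiger photo

best = max(label_vecs, key=lambda name: cosine(label_vecs[name], image_vec))
print(best)  # tiger
```

&lt;p&gt;Models like CLIP apply exactly this pattern at scale: a text encoder and an image encoder trained to land in one space, so any label you can describe in words becomes a candidate class.&lt;/p&gt;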
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;2. What Is Few-Shot Learning (FSL)?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Few-Shot Learning&lt;/strong&gt; refers to an AI model’s ability to complete a task after being exposed to only a few labeled examples (typically 1 to 10). It is particularly useful in scenarios where data is scarce.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;Simple Example&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Suppose you need to teach a model to distinguish between “apples” and “oranges.” By providing just five labeled images of each, the model can quickly learn how to classify new images into these two categories.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;How It Works&lt;/strong&gt;&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;In-Context Learning&lt;/strong&gt; Few-Shot Learning leverages examples provided within the task context to help the model infer rules. For example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Examples of apples:
Image 1: Red, round.
Image 2: Green, round.
Examples of oranges:
Image 1: Orange, round.
Image 2: Orange, slightly rough.
Task: What category does this new image belong to?
Image 3: Orange, round.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The model uses the context to deduce the classification.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Parameter Transfer&lt;/strong&gt; FSL often relies on transferring knowledge from a pretrained model to a new task. The model applies its prior understanding of related tasks to the new one.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Gradient-Based Fine-Tuning&lt;/strong&gt; A small amount of fine-tuning with limited labeled data allows the model to adjust its parameters for better task performance.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
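&lt;p&gt;One common few-shot recipe (the nearest-centroid idea behind prototypical networks) can be sketched with toy 2-d features; the feature values and labels below are invented for illustration:&lt;/p&gt;

```python
def classify_few_shot(query, support):
    """Nearest-centroid classification from a handful of labeled examples."""
    def centroid(points):
        dim = len(points[0])
        return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))

    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    # one "prototype" per class, averaged from its few support examples
    prototypes = {label: centroid(pts) for label, pts in support.items()}
    return min(prototypes, key=lambda label: sq_dist(prototypes[label], query))

# Toy 2-d features: (redness, roughness), two examples per class
support = {
    "apple":  [(0.9, 0.1), (0.2, 0.1)],   # red or green, smooth
    "orange": [(0.6, 0.7), (0.6, 0.9)],   # orange-colored, rough
}
print(classify_few_shot((0.6, 0.8), support))  # orange
```

&lt;p&gt;In practice the features would come from a pretrained encoder rather than hand-made tuples, which is exactly the parameter-transfer idea in step 2.&lt;/p&gt;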
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;3. Key Differences Between ZSL and FSL&lt;/strong&gt;&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Aspect&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Zero-Shot Learning (ZSL)&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Few-Shot Learning (FSL)&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Requirement&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No task-specific examples required.&lt;/td&gt;
&lt;td&gt;Requires a small number of labeled examples (1–10).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Approach&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Relies on general knowledge and natural language prompts.&lt;/td&gt;
&lt;td&gt;Combines task examples with prior model knowledge.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Use Case&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Best for tasks with no available labeled data.&lt;/td&gt;
&lt;td&gt;Suitable for scenarios with limited labeled data.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Model Dependency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Heavily depends on strong pretrained models.&lt;/td&gt;
&lt;td&gt;Requires pretrained models and task-specific adaptation.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;4. Real-World Applications&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;Zero-Shot Learning Applications&lt;/strong&gt;&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Text Classification&lt;/strong&gt; Using GPT-4 to classify text as positive or negative sentiment without training on labeled data, relying solely on the prompt: “Is this a positive or negative review?”&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Image Recognition&lt;/strong&gt; CLIP can identify objects in images by answering natural language queries like “Is this a panda?” without having been trained on specific panda images.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;New Task Inference&lt;/strong&gt; Models like GPT-4 can handle tasks like translation between languages it hasn’t explicitly been trained on, leveraging its general language understanding.&lt;/li&gt;
&lt;/ol&gt;
&lt;h4&gt;&lt;strong&gt;Few-Shot Learning Applications&lt;/strong&gt;&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Medical Diagnosis&lt;/strong&gt; Fine-tune a model with a few labeled medical records to diagnose rare diseases more accurately.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Niche Classification&lt;/strong&gt; Train a model to classify reviews in a specific industry (e.g., luxury goods) using only a handful of labeled examples.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Custom AI for Businesses&lt;/strong&gt; Fine-tune a model with a small dataset of customer support tickets to create a tailored AI assistant for answering specific queries.&lt;/li&gt;
&lt;/ol&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;5. Challenges of ZSL and FSL&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;Challenges of Zero-Shot Learning&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Understanding Task Descriptions&lt;/strong&gt; Models rely heavily on the clarity of natural language prompts, and vague instructions can lead to poor performance.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Domain Adaptation&lt;/strong&gt; Pretrained models may lack domain-specific knowledge (e.g., medical or legal), limiting their effectiveness in specialized areas.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;Challenges of Few-Shot Learning&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Sample Bias&lt;/strong&gt; A small dataset may not represent the full complexity of the task, leading to overfitting.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;High Data Quality Requirement&lt;/strong&gt; FSL demands clean, high-quality examples, as errors in the data can mislead the model.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;6. One-Line Summary&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Zero-shot learning enables models to infer tasks without any labeled data, while few-shot learning allows them to adapt quickly with just a few examples. Together, they make AI more flexible and efficient.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Final Thoughts&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;ZSL and FSL represent AI’s shift toward greater adaptability and efficiency, enabling it to perform tasks with minimal data. Whether you’re marveling at GPT-4’s zero-shot conversational skills or fine-tuning a few-shot model for a specific use case, these techniques are revolutionizing AI applications. Stay tuned for tomorrow’s topic, and follow for more AI insights!&lt;/p&gt;
&lt;p&gt;Finally, feel free to check out my other &lt;a href=&quot;https://geekcoding101.com/tags/daily-ai-insights&quot;&gt;AI Insights blog posts&lt;/a&gt;.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>The Hallucination Problem in Generative AI: Why Do Models “Make Things Up”?</title><link>https://geekcoding101.com/posts/the-hallucination-problem-in-generative-ai-why-do-models-make-things-up</link><guid isPermaLink="true">https://geekcoding101.com/posts/the-hallucination-problem-in-generative-ai-why-do-models-make-things-up</guid><pubDate>Sun, 15 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;./ai-hallucination.jpeg&quot; alt=&quot;AI hallucination&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Generative AI has taken the tech world by storm, revolutionizing how we interact with information and automation. But one pesky issue has left users both puzzled and amused—&lt;a href=&quot;https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)&quot;&gt;the “hallucination” problem&lt;/a&gt;. These hallucinations occur when AI models confidently produce incorrect or entirely fabricated content. Why does this happen, and how can we address it? Let’s explore.&lt;/p&gt;
&lt;h2&gt;What Is Hallucination in Generative AI?&lt;/h2&gt;
&lt;p&gt;In generative AI, hallucination refers to instances where the model outputs false or misleading information that may sound credible at first glance. These outputs often result from the limitations of the AI itself and the data it was trained on.&lt;/p&gt;
&lt;h3&gt;Common Examples of AI Hallucinations&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Fabricating facts&lt;/strong&gt;: AI models might confidently state that “Leonardo da Vinci invented the internet,” mixing plausible context with outright falsehoods.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Wrong quotes&lt;/strong&gt;: Ask &quot;Can you provide a source for the quote: &apos;The universe is under no obligation to make sense to you&apos;?&quot; and the AI may answer: &quot;This quote is from Albert Einstein in his book &lt;em&gt;The Theory of Relativity&lt;/em&gt;, published in 1921.&quot; The quote is actually from Neil deGrasse Tyson, not Einstein; the AI associates it with a famous physicist and invents a book to sound convincing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Incorrect technical explanations&lt;/strong&gt;: AI might produce an elegant but fundamentally flawed description of blockchain technology, misleading both novices and experts alike.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hallucination highlights the gap between how AI &quot;understands&quot; data and how humans process information.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Why Do AI Models Hallucinate?&lt;/h2&gt;
&lt;p&gt;The hallucination problem isn’t a mere bug—it stems from inherent technical limitations and design choices in generative AI systems.&lt;/p&gt;
&lt;h3&gt;Biased and Noisy Training Data&lt;/h3&gt;
&lt;p&gt;Generative AI relies on massive datasets to learn patterns and relationships. However, these datasets often contain:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Biased information&lt;/strong&gt;: Common errors or misinterpretations in the data propagate through the model.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Incomplete data&lt;/strong&gt;: Missing critical context or examples in the training corpus leads to incorrect generalizations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cultural idiosyncrasies&lt;/strong&gt;: Rare idiomatic expressions or language-specific nuances, like Chinese 成语, may be underrepresented in training data.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Limitations of Model Architecture&lt;/h3&gt;
&lt;p&gt;Generative AI predicts outputs based on probability rather than factual accuracy. Its core mechanism aims to find the &quot;most likely&quot; next word or phrase rather than verify its correctness. This design inherently prioritizes fluency over precision.&lt;/p&gt;
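&lt;p&gt;A tiny sketch makes this concrete. Suppose the model, after a prompt like &quot;The quote is from&quot;, assigns the (invented, purely illustrative) next-token probabilities below. Decoding optimizes likelihood, so the most statistically plausible name wins whether or not it is the true source:&lt;/p&gt;

```python
import random

# Toy next-token distribution; probabilities reflect co-occurrence in
# training text, not factual correctness.
next_token_probs = {"Einstein": 0.55, "Tyson": 0.25, "Feynman": 0.15, "Hawking": 0.05}

def pick_next(probs):
    """Greedy decoding: the most probable continuation wins, true or not."""
    return max(probs, key=probs.get)

def sample_next(probs):
    """Sampling can surface any plausible-sounding token, weighted by probability."""
    return random.choices(list(probs), weights=list(probs.values()), k=1)[0]

print(pick_next(next_token_probs))  # Einstein
```

&lt;p&gt;Nothing in either decoding rule checks a knowledge base; fluency and confidence come for free, accuracy does not.&lt;/p&gt;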
&lt;h3&gt;Influence of Prompts&lt;/h3&gt;
&lt;p&gt;The way users frame questions or inputs significantly affects AI responses. Ambiguity in prompts—common in languages like Chinese with complex grammar—can further exacerbate errors. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Asking “What are China’s five tallest mountains?” may prompt a mix of correct and fabricated peaks due to poorly structured data or vague phrasing.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h2&gt;How Does Hallucination Impact Users?&lt;/h2&gt;
&lt;p&gt;The hallucination problem isn’t just an academic curiosity—it has real-world consequences that impact trust, decision-making, and user experience.&lt;/p&gt;
&lt;h3&gt;Misleading Decisions&lt;/h3&gt;
&lt;p&gt;When users unknowingly rely on incorrect AI outputs, the results can be detrimental:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Academic Missteps&lt;/strong&gt;: Students may reference false information in essays or research papers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Business Risks&lt;/strong&gt;: Companies using AI for market analysis might make poor strategic decisions based on fabricated trends.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Challenges in Chinese Language Contexts&lt;/h3&gt;
&lt;p&gt;Chinese presents unique difficulties for AI systems, including:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Idioms and cultural references&lt;/strong&gt;: Misinterpreting or misusing idiomatic expressions can lead to miscommunication.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ambiguity and polysemy&lt;/strong&gt;: Words with multiple meanings in Chinese can confuse AI and cause inaccurate translations or explanations.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Eroding Trust in AI&lt;/h3&gt;
&lt;p&gt;Frequent hallucinations can erode user confidence in generative AI, especially in high-stakes domains like healthcare, finance, or law. Once trust diminishes, adoption rates decline, stalling technological progress.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;How Can We Address the Hallucination Problem?&lt;/h2&gt;
&lt;p&gt;While hallucination cannot be entirely eliminated, there are practical steps to mitigate its effects.&lt;/p&gt;
&lt;h3&gt;Improve Training Data Quality&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Data cleaning&lt;/strong&gt;: Eliminate incorrect or low-quality information from training datasets.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Expand data diversity&lt;/strong&gt;: Incorporate underrepresented linguistic and cultural examples, such as idioms and colloquialisms.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Update for relevance&lt;/strong&gt;: Continuously supplement datasets with the latest verified information.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Implement Post-Processing Mechanisms&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Human review&lt;/strong&gt;: Deploy experts to validate AI-generated outputs in critical applications.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Algorithmic validation&lt;/strong&gt;: Use secondary AI models or rule-based systems to cross-check outputs for logical consistency.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Educate Users on AI Limitations&lt;/h3&gt;
&lt;p&gt;Empowering users with knowledge about AI&apos;s strengths and weaknesses fosters better usage. Teach users how to frame precise prompts and critically evaluate outputs rather than taking them at face value.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Future Outlook: Balancing Challenges and Opportunities&lt;/h2&gt;
&lt;p&gt;The hallucination problem underscores the limitations of even the most advanced generative AI systems. However, it also highlights areas for growth and innovation.&lt;/p&gt;
&lt;h3&gt;Can Hallucination Be Fully Eliminated?&lt;/h3&gt;
&lt;p&gt;Complete elimination of hallucinations seems unlikely due to the probabilistic nature of AI. However, ongoing improvements in training, validation, and architecture can significantly reduce the frequency and impact of hallucinations.&lt;/p&gt;
&lt;h3&gt;Best Practices for Coexisting with AI&lt;/h3&gt;
&lt;p&gt;The future lies in human-AI collaboration rather than blind reliance. By leveraging AI for what it excels at—pattern recognition, rapid response, and creativity—while compensating for its weaknesses, we can achieve a balanced coexistence.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Conclusion and Discussion&lt;/h2&gt;
&lt;p&gt;The hallucination problem in generative AI is a reminder that even cutting-edge technology is not infallible. What steps do you think are most effective for addressing this issue? Have you encountered amusing or frustrating examples of AI hallucinations? Share your thoughts and stories in the comments below!&lt;/p&gt;
&lt;p&gt;Finally, feel free to check out my other &lt;a href=&quot;https://geekcoding101.com/tags/daily-ai-insights&quot;&gt;AI Insights blog posts&lt;/a&gt;.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Knowledge Distillation: How Big Models Train Smaller Ones</title><link>https://geekcoding101.com/posts/knowledge-distillation-how-big-models-train-smaller-ones</link><guid isPermaLink="true">https://geekcoding101.com/posts/knowledge-distillation-how-big-models-train-smaller-ones</guid><pubDate>Mon, 16 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;strong&gt;Knowledge Distillation in AI&lt;/strong&gt; is a powerful method where large models (teacher models) transfer their knowledge to smaller, efficient models (student models). This technique enables AI to retain high performance while reducing computational costs, speeding up inference, and facilitating deployment on resource-constrained devices like mobile phones and edge systems. By mimicking the outputs of teacher models, student models deliver lightweight, optimized solutions ideal for real-world applications. Let’s explore how knowledge distillation works and why it’s transforming modern AI.&lt;/p&gt;
&lt;h3&gt;&lt;strong&gt;1. What Is Knowledge Distillation?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Knowledge distillation&lt;/strong&gt; is a technique where a &lt;strong&gt;large model (Teacher Model)&lt;/strong&gt; transfers its knowledge to a &lt;strong&gt;smaller model (Student Model)&lt;/strong&gt;. The goal is to compress the large model’s capabilities into a lightweight version that is faster, more efficient, and easier to deploy, while retaining high performance.&lt;/p&gt;
&lt;p&gt;Think of a teacher (large model) simplifying complex ideas for a student (small model). The teacher provides not just the answers but also insights into how the answers were derived, allowing the student to replicate the process efficiently.&lt;/p&gt;
&lt;p&gt;This illustration from &lt;a href=&quot;https://arxiv.org/abs/2006.05525&quot;&gt;Knowledge Distillation: A Survey&lt;/a&gt; explains it:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./knowledge-distillation-01.jpg&quot; alt=&quot;knowledge-distillation-01&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Another figure comes from &lt;a href=&quot;https://arxiv.org/html/2402.13116v1&quot;&gt;A Survey on Knowledge Distillation of Large Language Models&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./knowledge-distillation-02.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;2. Why Is Knowledge Distillation Important?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Large models (e.g., GPT-4) are powerful but have significant limitations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;High Computational Costs&lt;/strong&gt;: Require expensive hardware and energy to run.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deployment Challenges&lt;/strong&gt;: Difficult to use on mobile devices or edge systems.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Slow Inference&lt;/strong&gt;: Unsuitable for real-time applications like voice assistants.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Knowledge distillation helps address these issues by:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Reducing Model Size&lt;/strong&gt;: Smaller models require fewer resources.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Improving Speed&lt;/strong&gt;: Faster inference makes them ideal for resource-constrained environments.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Maintaining Accuracy&lt;/strong&gt;: By learning from large models, smaller models can achieve comparable performance.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;3. How Does Knowledge Distillation Work?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;The process involves several key steps:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Train the Teacher Model&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;A large model is trained on a comprehensive dataset to achieve high accuracy and generalization.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(2) Generate Soft Targets&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;The teacher model produces outputs with detailed probability distributions.&lt;/li&gt;
&lt;li&gt;For example, when classifying an image, instead of just saying “cat,” the teacher might output:
&lt;ul&gt;
&lt;li&gt;Cat: 80%&lt;/li&gt;
&lt;li&gt;Dog: 15%&lt;/li&gt;
&lt;li&gt;Fox: 5%.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;These &lt;strong&gt;soft targets&lt;/strong&gt; provide rich information about how the teacher distinguishes between categories.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(3) Train the Student Model&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;The smaller model learns from both the teacher’s soft targets and the original data.&lt;/li&gt;
&lt;li&gt;By mimicking the teacher’s outputs, the student absorbs the distilled knowledge without requiring as much capacity.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(4) Evaluate and Optimize&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;The student model’s performance is validated and fine-tuned to ensure it meets the desired accuracy and efficiency.&lt;/p&gt;
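&lt;p&gt;Steps 2 and 3 can be sketched numerically. The loss below follows the common temperature-softened formulation (as in Hinton et al.&apos;s distillation paper); the logits are made-up example values:&lt;/p&gt;

```python
import math

def softmax(logits, T=1.0):
    """Soften logits with temperature T; higher T spreads probability mass."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy between the teacher's soft targets and the student's
    softened predictions, scaled by T*T to keep gradient magnitudes stable."""
    teacher_p = softmax(teacher_logits, T)
    student_p = softmax(student_logits, T)
    return -sum(t * math.log(s) for t, s in zip(teacher_p, student_p)) * T * T

# Teacher is confident it's a cat but still leaks "dark knowledge"
# about how cat-like the dog and fox classes are.
teacher_logits = [4.0, 2.0, 1.0]   # cat, dog, fox
student_logits = [3.5, 1.5, 0.5]
print(distillation_loss(student_logits, teacher_logits))
```

&lt;p&gt;The soft targets matter because the teacher&apos;s 80/15/5 split carries far more signal than a hard &quot;cat&quot; label: the student also learns which mistakes are near-misses.&lt;/p&gt;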
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;4. A Simple Example: The Classroom Analogy&lt;/strong&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Without Distillation&lt;/strong&gt;: A small model learns directly from raw data, like a student relying solely on a textbook without guidance.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;With Distillation&lt;/strong&gt;: The teacher (large model) explains not only the answers but also why certain conclusions are drawn. The student absorbs these nuanced insights, leading to better understanding.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Perhaps the figure below, from &lt;a href=&quot;https://towardsdatascience.com/knowledge-distillation-simplified-dd4973dbc764&quot;&gt;Knowledge Distillation: Simplified&lt;/a&gt;, can help:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./knowledge-distillation-03.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;5. Real-World Applications of Knowledge Distillation&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;(1) Lightweight AI on Edge Devices&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;Small, distilled models are deployed on smartphones, IoT devices, and embedded systems.&lt;/li&gt;
&lt;li&gt;Example: A distilled CLIP model for image classification on mobile.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(2) Real-Time Applications&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;Faster inference is crucial for speech recognition or recommendation systems in real-time scenarios.&lt;/li&gt;
&lt;li&gt;Example: Voice assistants using distilled models for quick responses.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(3) Multitask Learning&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;Combine multiple teacher models into one small model capable of handling various tasks.&lt;/li&gt;
&lt;li&gt;Example: A single model for both translation and sentiment analysis.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;6. Challenges in Knowledge Distillation&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;(1) Knowledge Loss&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Small models may fail to replicate the full depth of understanding from large models, especially for complex tasks.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(2) Computational Overhead&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Generating soft targets from a teacher model can be resource-intensive when working with large datasets.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(3) Task-Specific Needs&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Different tasks require different knowledge. Adapting distilled models to specific tasks remains a research challenge.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;7. One-Line Summary&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Knowledge distillation compresses the “wisdom” of large models into smaller, efficient ones, enabling faster, cost-effective AI without sacrificing accuracy.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Final Thoughts&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Knowledge distillation bridges the gap between large, powerful models and real-world deployment. By making AI both smarter and leaner, this technique is transforming applications from edge devices to real-time systems. Next time you use a quick AI assistant on your phone, think about the distilled knowledge powering it. Stay tuned for more insights tomorrow!&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Weight Initialization: Unleashing AI Performance Excellence</title><link>https://geekcoding101.com/posts/weight-initialization-unleashing-ai-performance-excellence</link><guid isPermaLink="true">https://geekcoding101.com/posts/weight-initialization-unleashing-ai-performance-excellence</guid><pubDate>Mon, 16 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;./weight-initialization.jpeg&quot; alt=&quot;a diagram of a weight initialization&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Weight Initialization in AI&lt;/strong&gt; plays a crucial role in ensuring effective neural network training. It determines the starting values for connections (weights) in a model, significantly influencing training speed, stability, and overall performance. Proper weight initialization prevents issues like vanishing or exploding gradients, accelerates convergence, and helps models achieve better results. Whether you’re working with Xavier, He, or orthogonal initialization, understanding these methods is essential for building high-performance AI systems.&lt;/p&gt;
&lt;p&gt;Ugh, such a headache… sorry. Honestly, today’s chapter involves some formulas, and I feel like it’s tough to explain them clearly in such a limited space. But hey, it’s just a casual explainer piece, right? Hopefully, I can follow up with a deeper dive into the principles later on…&lt;/p&gt;
&lt;h3&gt;&lt;strong&gt;1. What Is Weight Initialization?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Weight initialization&lt;/strong&gt; is the process of assigning initial values to the weights of a neural network before training begins. These weights determine how neurons are connected and how much influence each connection has. While the values will be adjusted during training, their starting points can significantly impact the network’s ability to learn effectively.&lt;/p&gt;
&lt;p&gt;Think of weight initialization as choosing your &lt;strong&gt;starting point for a journey&lt;/strong&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A good starting point (proper initialization) puts you on the right path for a smooth trip.&lt;/li&gt;
&lt;li&gt;A bad starting point (poor initialization) may lead to delays, detours, or even getting lost altogether.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;2. Why Is Weight Initialization Important?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;The quality of weight initialization directly affects several key aspects of model training:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Training Speed&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;Poor initialization can slow down the model’s ability to learn by causing redundant or inefficient updates.&lt;/li&gt;
&lt;li&gt;Good initialization accelerates convergence, meaning the model learns faster.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(2) Gradient Behavior&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Vanishing Gradients&lt;/strong&gt;: If weights are initialized too small, gradients shrink as they propagate backward, making it difficult for deeper layers to update.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Exploding Gradients&lt;/strong&gt;: If weights are initialized too large, gradients grow exponentially, leading to instability during training.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(3) Final Model Performance&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;A well-initialized network is more likely to reach a better final solution, while a poorly initialized one may get stuck in a suboptimal solution or fail to train altogether.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;3. Everyday Examples of Weight Initialization&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;Example 1: The Zero Trap&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Imagine you’re training a neural network to distinguish between &quot;cats&quot; and &quot;dogs.&quot; If all weights are initialized to zero, every neuron in the network will compute the same value. The network will be incapable of learning diverse features like &quot;whiskers&quot; for cats or &quot;tail shapes&quot; for dogs. It’s like asking a group of people to vote, but everyone always gives the same answer—no progress can be made.&lt;/p&gt;
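&lt;p&gt;A tiny numerical sketch (using NumPy, with made-up values) shows the zero trap directly: with all-zero weights and a tanh hidden layer, one backpropagation step produces all-zero gradients, so the network never moves at all.&lt;/p&gt;

```python
import numpy as np

# Tiny 2-layer network; every weight starts at zero.
x = np.array([[1.0, 2.0, 3.0]])   # one input example
w1 = np.zeros((3, 4))             # hidden-layer weights
w2 = np.zeros((4, 1))             # output weights
y = np.array([[1.0]])             # target

h = np.tanh(x @ w1)               # hidden activations: all zero
out = h @ w2                      # prediction: zero
grad_out = out - y                # squared-error gradient at the output
grad_w2 = h.T @ grad_out          # zero, because h is zero
grad_h = grad_out @ w2.T          # zero, because w2 is zero
grad_w1 = x.T @ (grad_h * (1 - h ** 2))  # zero as well

# Every weight receives a zero update, so the neurons can never
# differentiate from one another -- the "zero trap".
```

&lt;p&gt;With other activation functions the gradients may be nonzero, but they are identical across neurons, which is just as bad: the symmetry never breaks, so no diverse features can be learned.&lt;/p&gt;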
&lt;h4&gt;&lt;strong&gt;Example 2: Random Chaos&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Suppose weights are initialized randomly but with values that are too large. The network becomes chaotic, like a classroom where everyone is shouting different answers at once. The gradients become uncontrollable, and learning collapses.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;Example 3: The Sweet Spot&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;With proper initialization (e.g., scaled random values), the network starts off on a stable footing. It’s like giving each voter clear instructions—everyone brings unique but manageable inputs to the table, allowing the group to reach a consensus effectively.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;4. Common Weight Initialization Methods&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Here are the most widely used approaches, explained without diving into technical formulas:&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;(1) Random Initialization&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;Assign random values to the weights.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pro&lt;/strong&gt;: Breaks symmetry and ensures neurons don’t learn identical features.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Con&lt;/strong&gt;: If the range of randomness is too wide or narrow, training becomes unstable or slow.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(2) Xavier Initialization&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;Designed to maintain balance in gradient flow across layers.&lt;/li&gt;
&lt;li&gt;I found &lt;a href=&quot;https://365datascience.com/tutorials/machine-learning-tutorials/what-is-xavier-initialization/&quot;&gt;this article explaining Xavier initialization&lt;/a&gt; very helpful; feel free to check it out.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Best For&lt;/strong&gt;: Networks using smooth activation functions like Sigmoid or tanh.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Benefit&lt;/strong&gt;: Helps gradients propagate effectively without vanishing or exploding.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(3) He Initialization&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;Specifically tailored for ReLU activation functions.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why It Works&lt;/strong&gt;: ReLU zeroes out negative inputs, so roughly half the signal is lost at each layer; He initialization compensates with a larger initial variance so gradients keep flowing through the network.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Best For&lt;/strong&gt;: Deep networks with ReLU or its variants.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;(4) Orthogonal Initialization&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;Starts with weights that form an orthogonal matrix.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pro&lt;/strong&gt;: Ensures independence between different directions in the weight space.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Best For&lt;/strong&gt;: Complex or very deep networks.&lt;/li&gt;
&lt;/ul&gt;
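&lt;p&gt;For readers who want to see the scales involved, here is a short NumPy sketch of the three named schemes. The layer sizes are arbitrary examples, and real frameworks offer these as built-in initializers.&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_init(fan_in, fan_out):
    # Xavier/Glorot: variance 2 / (fan_in + fan_out), suited to tanh/sigmoid.
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_in, fan_out))

def he_init(fan_in, fan_out):
    # He: variance 2 / fan_in, compensating for ReLU zeroing half the inputs.
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))

def orthogonal_init(fan_in, fan_out):
    # Orthogonal: QR decomposition of a random matrix yields orthonormal
    # columns (assumes fan_in is at least fan_out in this simple sketch).
    a = rng.normal(size=(fan_in, fan_out))
    q, _ = np.linalg.qr(a)
    return q
```

&lt;p&gt;Note how both Xavier and He shrink the initial standard deviation as layers get wider, which is exactly what keeps activations and gradients in a stable range.&lt;/p&gt;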
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;5. Practical Challenges and Optimizations&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;Challenges&lt;/strong&gt;&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Dynamic Needs&lt;/strong&gt;: Different network architectures and activation functions require tailored initialization methods. A one-size-fits-all approach rarely works.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deep Networks&lt;/strong&gt;: In extremely deep networks, even good initialization methods may struggle to maintain stable gradients.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;Optimizations&lt;/strong&gt;&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Activation Function Pairing&lt;/strong&gt;: Match initialization methods with the activation function. For example, He initialization works well with ReLU.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Normalization Layers&lt;/strong&gt;: Techniques like Batch Normalization or Layer Normalization can mitigate the effects of poor initialization.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Manual Fine-Tuning&lt;/strong&gt;: In some cases, experimenting with the initialization range for specific layers can yield better results.&lt;/li&gt;
&lt;/ol&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;6. One-Line Summary&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Weight initialization is the starting point for a neural network’s training journey, and proper initialization ensures the model learns efficiently, avoids gradient issues, and achieves better performance.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Final Thoughts&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Weight initialization might seem like a small step in the deep learning pipeline, but it’s a critical factor for training success. The next time you train a neural network, pay close attention to your initialization strategy—it could make or break your model’s performance. Stay tuned for more AI insights, and let’s continue exploring together!&lt;/p&gt;
&lt;p&gt;At the end, please feel free to check out my other &lt;a href=&quot;https://geekcoding101.com/tags/daily-ai-insights&quot;&gt;AI Insights blog posts here&lt;/a&gt;.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Quantization: How to Unlock Incredible Efficiency on AI Models</title><link>https://geekcoding101.com/posts/quantization-how-to-unlock-incredible-efficiency-on-ai-models</link><guid isPermaLink="true">https://geekcoding101.com/posts/quantization-how-to-unlock-incredible-efficiency-on-ai-models</guid><pubDate>Wed, 18 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;./genai-quantization01.png&quot; alt=&quot;Quantization illustration&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Quantization is a transformative AI optimization technique that compresses models by reducing precision from high-bit floating-point numbers (e.g., FP32) to low-bit integers (e.g., INT8). This process significantly decreases storage requirements, speeds up inference, and enables deployment on resource-constrained devices like mobile phones or IoT systems—all while retaining close-to-original performance. Let’s explore why it is essential, how it works, and its real-world applications.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;Why Do AI Models Need to Be Slimmed Down?&lt;/h3&gt;
&lt;p&gt;AI models are growing exponentially in size, with models like GPT-4 containing hundreds of billions of parameters. While their performance is impressive, this scale brings challenges:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;High Computational Costs&lt;/strong&gt;: Large models require expensive hardware like GPUs or TPUs, with significant power consumption.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Slow Inference Speed&lt;/strong&gt;: Real-time applications, such as voice assistants or autonomous driving, demand fast responses that large models struggle to provide.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deployment Constraints&lt;/strong&gt;: Limited memory and compute power on mobile or IoT devices make running large models impractical.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;The Problem&lt;/h4&gt;
&lt;p&gt;How can we preserve the capabilities of large models while making them lightweight and efficient?&lt;/p&gt;
&lt;h4&gt;The Solution&lt;/h4&gt;
&lt;p&gt;&lt;strong&gt;Quantization.&lt;/strong&gt; This optimization method compresses models to improve efficiency without sacrificing much performance.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;What Is It?&lt;/h3&gt;
&lt;p&gt;It reduces the precision of AI model parameters (weights) and intermediate results (activations) from high-precision formats like FP32 to lower-precision formats like FP16 or INT8.&lt;/p&gt;
&lt;h4&gt;Simplified Analogy&lt;/h4&gt;
&lt;p&gt;It is like compressing an image:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Original Image (High Precision)&lt;/strong&gt;: High resolution, large file size, slow to load.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Compressed Image (Low Precision)&lt;/strong&gt;: Smaller file size with slightly lower quality but faster and more efficient.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;How Does It Work?&lt;/h3&gt;
&lt;p&gt;The key is representing parameters and activations using fewer bits while minimizing performance loss. This involves two main steps:&lt;/p&gt;
&lt;h4&gt;1. Numerical Range Mapping&lt;/h4&gt;
&lt;p&gt;High-precision floating-point numbers are mapped to a smaller integer range.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;For example, a floating-point parameter ranging from [-2.0, 2.0] is mapped to integers in [0, 255].&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;2. Float-to-Integer Conversion&lt;/h4&gt;
&lt;p&gt;Using a &lt;strong&gt;scale factor&lt;/strong&gt;, floating-point values are converted to integers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;-2.0 becomes 0.&lt;/li&gt;
&lt;li&gt;2.0 becomes 255.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Result&lt;/h4&gt;
&lt;p&gt;The model operates at a lower precision but retains the key information needed for accurate predictions.&lt;/p&gt;
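&lt;p&gt;The two steps above can be sketched in a few lines of Python. This is a simplified asymmetric (affine) scheme using the [-2.0, 2.0] to [0, 255] mapping from the example; real toolkits also calibrate the range per tensor or per channel.&lt;/p&gt;

```python
def quantize(values, lo=-2.0, hi=2.0, bits=8):
    # Map floats in [lo, hi] onto integers in [0, 2**bits - 1].
    levels = 2 ** bits - 1            # 255 for INT8
    scale = (hi - lo) / levels        # size of one quantization step
    clamped = [max(lo, min(hi, v)) for v in values]
    q = [round((v - lo) / scale) for v in clamped]
    return q, scale

def dequantize(q, scale, lo=-2.0):
    # Approximately recover the original floats; the small residual
    # difference is the quantization error.
    return [lo + qi * scale for qi in q]

q, scale = quantize([-2.0, 0.0, 2.0])   # q == [0, 128, 255]
restored = dequantize(q, scale)
```

&lt;p&gt;Round-tripping a value recovers it only to within one quantization step, which is exactly the precision traded away for the smaller representation.&lt;/p&gt;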
&lt;hr /&gt;
&lt;h3&gt;Core Processes and Methods&lt;/h3&gt;
&lt;h4&gt;1. Weight Quantization&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What It Does&lt;/strong&gt;: Converts model parameters from FP32 to INT8.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Effect&lt;/strong&gt;: Reduces storage requirements significantly but may introduce minor errors.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;2. Activation Quantization&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What It Does&lt;/strong&gt;: Quantizes intermediate computation results during inference.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Effect&lt;/strong&gt;: Further reduces compute demands but requires hardware support.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;3. Quantization-Aware Training (QAT)&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What It Does&lt;/strong&gt;: Simulates quantization during training so the model can adapt to low-precision calculations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Effect&lt;/strong&gt;: Retains higher accuracy compared to post-training quantization.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;4. Dynamic Quantization&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What It Does&lt;/strong&gt;: Dynamically quantizes activations during inference while keeping weights in high precision.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Effect&lt;/strong&gt;: Suitable for real-time applications, offering flexibility in deployment.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Learn more from this article:&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://selek.tech/posts/static-vs-dynamic-quantization-in-machine-learning/&quot;&gt;Static vs Dynamic Quantization in Machine Learning&lt;/a&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;Real-World Applications&lt;/h3&gt;
&lt;h4&gt;1. Voice Assistants on Mobile Devices&lt;/h4&gt;
&lt;p&gt;Voice assistants require fast responses, but large models consume too much power. By quantizing a speech recognition model, it can run locally on phones, doubling response speed and reducing power consumption by 40%.&lt;/p&gt;
&lt;h4&gt;2. Image Classification on Edge Devices&lt;/h4&gt;
&lt;p&gt;Edge devices like security cameras need to process large volumes of real-time video data. Quantizing a ResNet model from FP32 to INT8 increases inference speed by 3x while reducing memory usage by 70%.&lt;/p&gt;
&lt;h4&gt;3. Real-Time Object Detection in Autonomous Vehicles&lt;/h4&gt;
&lt;p&gt;Autonomous vehicles require high-accuracy, low-latency object detection. Using quantization-aware training, models maintain precision while accelerating processing speeds, enabling faster responses to sudden obstacles.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;Limitations&lt;/h3&gt;
&lt;p&gt;Despite its benefits, quantization has some limitations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Accuracy Loss&lt;/strong&gt;: Low precision can introduce quantization errors, which affect performance in high-accuracy tasks like medical diagnostics.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hardware Dependency&lt;/strong&gt;: Efficient quantized operations require hardware that supports low-precision calculations, such as INT8-compatible devices.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Limited Scope&lt;/strong&gt;: Adapting quantized models to complex or multimodal tasks remains a challenge.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;The Future&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Mixed Precision Computing&lt;/strong&gt;: Combining low-precision (e.g., INT8) and high-precision (e.g., FP16/FP32) operations to balance performance and accuracy.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Improved Quantization-Aware Training&lt;/strong&gt;: Enhancing training methods to automatically optimize weight distributions during quantization.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Specialized Hardware Support&lt;/strong&gt;: Designing chips optimized for ultra-low precision calculations (e.g., INT4, INT2) to further reduce energy consumption.&lt;/li&gt;
&lt;/ol&gt;
&lt;hr /&gt;
&lt;h3&gt;One-Line Summary&lt;/h3&gt;
&lt;p&gt;Quantization enables AI models to transition from “high precision” to “high efficiency,” making them lightweight yet powerful—an essential tool for modern AI.&lt;/p&gt;
&lt;p&gt;At the end, you&apos;re welcome to check out my other &lt;a href=&quot;https://geekcoding101.com/tags/daily-ai-insights&quot;&gt;AI Insights blog posts here&lt;/a&gt;.&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Ray Serve: The Versatile Assistant for Model Serving</title><link>https://geekcoding101.com/posts/ray-serve-the-versatile-assistant-for-model-serving</link><guid isPermaLink="true">https://geekcoding101.com/posts/ray-serve-the-versatile-assistant-for-model-serving</guid><pubDate>Fri, 20 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;a href=&quot;https://docs.ray.io/en/latest/index.html&quot;&gt;Ray Serve&lt;/a&gt; is a cutting-edge model serving library built on the Ray framework, designed to simplify and scale AI model deployment. Whether you’re chaining models in sequence, running them in parallel, or dynamically routing requests, Ray Serve excels at handling complex, distributed inference pipelines. Unlike Ollama or FastAPI, it combines ease of use with powerful scaling, multi-model management, and Pythonic APIs. In this post, we’ll explore how Ray Serve compares to other solutions and why it stands out for large-scale, multi-node AI serving.&lt;/p&gt;
&lt;h3&gt;&lt;strong&gt;Before Introducing Ray Serve, We Need to Understand Ray&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;What is Ray?&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Ray is an open-source distributed computing framework that provides the core tools and components for building and running distributed applications. Its goal is to enable developers to easily scale single-machine programs to distributed environments, supporting high-performance tasks such as distributed model training, large-scale data processing, and distributed inference.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;Core Modules of Ray&lt;/strong&gt;&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Ray Core&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;The foundation of Ray, providing distributed scheduling, task execution, and resource management.&lt;/li&gt;
&lt;li&gt;Allows Python functions to be seamlessly transformed into distributed tasks using the &lt;code&gt;@ray.remote&lt;/code&gt; decorator.&lt;/li&gt;
&lt;li&gt;Ideal for distributed data processing and computation-intensive workloads.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ray Libraries&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Built on top of Ray Core, these are specialized tools designed for specific tasks. Examples include:
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Ray Tune&lt;/strong&gt;: For hyperparameter search and experiment optimization.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ray Train&lt;/strong&gt;: For distributed model training.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ray Serve&lt;/strong&gt;: For distributed model serving.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ray Data&lt;/strong&gt;: For large-scale data and stream processing.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In simpler terms, &lt;strong&gt;Ray Core&lt;/strong&gt; is the underlying engine, while the various tools (like Ray Serve) are specific modules built on top of it to handle specific functionalities.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Now Let’s Talk About Ray Serve...&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Many people ask: “Is Ray Serve just a backend service that routes user requests to an LLM (Large Language Model) and returns the results?”&lt;/p&gt;
&lt;p&gt;You’re half right! Ray Serve does exactly that, but it’s &lt;strong&gt;much more than just a “delivery boy.”&lt;/strong&gt; Compared to a basic FastAPI backend or a dedicated tool like Ollama, Ray Serve is a &lt;strong&gt;flexible, capable, and self-scaling assistant&lt;/strong&gt; that handles much more than just routing.&lt;/p&gt;
&lt;p&gt;Let’s dive in and break down what Ray Serve does, and how it compares to Ollama or a custom-built FastAPI solution.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Ray Serve: The Versatile Multi-Tasker of Model Serving&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;In short, &lt;strong&gt;Ray Serve’s mission is to:&lt;/strong&gt; “Handle user requests, route them to the right model for processing, optimize resources, and dynamically scale as needed.”&lt;/p&gt;
&lt;p&gt;It’s like a &lt;strong&gt;supercharged scheduler&lt;/strong&gt; that performs the following key tasks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Setting up model services&lt;/strong&gt;: You tell it where your model is (e.g., a GPT-4 instance), and it will automatically handle receiving requests, sending inference tasks, and even batching requests.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Managing traffic spikes&lt;/strong&gt;: When user requests flood in like a tidal wave, it dynamically scales instances to handle the pressure.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Supporting multiple models&lt;/strong&gt;: With Ray Serve, you can host multiple models in a single service (e.g., one for text generation and another for spam classification) without any issues.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So no, Ray Serve isn’t just “doing the grunt work”—it also adjusts the architecture, adds new resources, and patches itself when needed.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Application Patterns of Ray Serve: Adapting to Multi-Model and Multi-Step Inference&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;The diagrams in this section are from &lt;a href=&quot;https://www.anyscale.com/glossary/what-is-ray-serve&quot;&gt;https://www.anyscale.com/glossary/what-is-ray-serve&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In modern AI systems, multi-model and multi-step inference has become a common requirement. Whether you’re processing images, text, or multi-modal inputs, model services need to support &lt;strong&gt;flexible inference patterns&lt;/strong&gt;. Ray Serve excels here by seamlessly adapting to the following three classic patterns, offering simple and efficient Pythonic APIs to minimize configuration complexity.&lt;/p&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Pattern 1: Sequential Model Inference (Chaining Models in Sequence)&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;&lt;img src=&quot;./ray-serve-pattern1.jpg&quot; alt=&quot;Ray Serve Pattern 01&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; In this pattern, user input passes through multiple models sequentially, with each model’s output serving as the input for the next. This chained structure is common in tasks like image processing or data transformations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; For an image enhancement task, the input might go through a denoising model (Model_1), followed by a feature extraction model (Model_2), and finally a classification model (Model_3).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Advantages of Ray Serve:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Efficient Communication&lt;/strong&gt;: Data is passed between models using shared memory, reducing overhead.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Flexible Scheduling&lt;/strong&gt;: Resources are dynamically allocated to ensure stable and efficient inference pipelines.&lt;/li&gt;
&lt;/ul&gt;
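&lt;p&gt;The chaining logic itself is simple; in Ray Serve each stage would be wrapped in its own deployment, but the plain-Python sketch below (with hypothetical placeholder functions standing in for Model_1 through Model_3) shows the data flow.&lt;/p&gt;

```python
# Placeholder "models" standing in for Model_1..Model_3 in the diagram.
def denoise(image):
    return {"pixels": image["pixels"], "denoised": True}

def extract_features(image):
    return {"features": [len(image["pixels"])], "denoised": image["denoised"]}

def classify(features):
    # Trivial stand-in classifier for illustration.
    return "cat" if features["features"][0] % 2 == 0 else "dog"

def pipeline(image):
    # Each stage's output feeds the next stage, exactly as in the diagram.
    return classify(extract_features(denoise(image)))

result = pipeline({"pixels": [0.1, 0.2, 0.3, 0.4]})
```

&lt;p&gt;Ray Serve’s contribution is not the chaining itself but running each stage as an independently scaled deployment, with data passed efficiently between them.&lt;/p&gt;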
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Pattern 2: Parallel Model Inference (Ensembling Models)&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;&lt;img src=&quot;./ray-serve-pattern2.jpg&quot; alt=&quot;Ray Serve Pattern 02&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; In this pattern, user input is sent to multiple models simultaneously, with each model processing the request independently. The results are then aggregated by an ensemble step to produce the final output. This pattern is often used in recommendation systems or ensemble learning, where outputs from multiple models are combined for decision-making.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; A recommendation system might use collaborative filtering (Model_1), a deep learning model (Model_2), and a rule-based model (Model_3) to make predictions, then select the best recommendation based on business logic.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Advantages of Ray Serve:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Flexible Routing Mechanism&lt;/strong&gt;: Easily configure multiple model endpoints for parallel processing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;High-Concurrency Handling&lt;/strong&gt;: Ray’s distributed architecture efficiently manages high-load scenarios with multiple models.&lt;/li&gt;
&lt;/ul&gt;
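&lt;p&gt;The fan-out-and-aggregate logic can likewise be sketched in plain Python (the three recommenders and their scores below are hypothetical; Ray Serve would run them as parallel deployments rather than sequential calls):&lt;/p&gt;

```python
# Three placeholder recommenders standing in for Model_1..Model_3.
def collaborative_filtering(user):
    return {"item": "book", "score": 0.7}

def deep_model(user):
    return {"item": "movie", "score": 0.9}

def rule_based(user):
    return {"item": "book", "score": 0.6}

def ensemble(user):
    # Fan the request out to every model, then aggregate: here we simply
    # keep the highest-scoring prediction (the "business logic" step).
    models = (collaborative_filtering, deep_model, rule_based)
    predictions = [m(user) for m in models]
    return max(predictions, key=lambda p: p["score"])

best = ensemble({"id": 42})
```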
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Pattern 3: Dynamic Model Dispatching (Dynamic Dispatching to Models)&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;&lt;img src=&quot;./ray-serve-pattern3.jpg&quot; alt=&quot;Ray Serve Pattern 03&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; Here, models are dynamically selected based on the input’s characteristics, ensuring that only the necessary models are triggered for inference. This is ideal for scenarios with complex classification tasks or diverse model types.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; In an image classification system, depending on the input image (e.g., a fruit, car, or plant), a specialized model is dynamically chosen for inference instead of invoking every model in the pipeline.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Advantages of Ray Serve:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Resource Efficiency&lt;/strong&gt;: Only the required models are triggered, avoiding unnecessary computation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Flexible Business Logic&lt;/strong&gt;: Dynamic routing rules can be easily defined with simple Python code, eliminating the need for complex YAML configurations.&lt;/li&gt;
&lt;/ul&gt;
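&lt;p&gt;At its core, dynamic dispatching is just a routing table consulted per request. The hypothetical sketch below shows the idea in plain Python; in Ray Serve the routing code would live in an ingress deployment and each specialized model would scale independently.&lt;/p&gt;

```python
# Placeholder specialized models keyed by input category.
def fruit_model(image):
    return "apple"

def car_model(image):
    return "sedan"

def plant_model(image):
    return "fern"

ROUTES = {"fruit": fruit_model, "car": car_model, "plant": plant_model}

def classify_category(image):
    # Stand-in for a cheap upstream classifier that inspects the input.
    return image["category"]

def dispatch(image):
    # Only the one relevant model runs; the others are never invoked.
    model = ROUTES[classify_category(image)]
    return model(image)

label = dispatch({"category": "car", "pixels": []})
```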
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Unique Advantages of Ray Serve&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Compared to other model-serving frameworks like &lt;strong&gt;TensorFlow Serving&lt;/strong&gt; or &lt;strong&gt;NVIDIA Triton&lt;/strong&gt;, Ray Serve offers unique advantages for multi-step and multi-model inference scenarios:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Dynamic Scheduling&lt;/strong&gt;: Adjust resources and routing strategies based on workload requirements.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Efficient Communication&lt;/strong&gt;: Optimize data transfer between models using shared memory to reduce overhead.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Granular Resource Allocation&lt;/strong&gt;: Assign fractional CPU or GPU resources to model instances, improving utilization.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pythonic API&lt;/strong&gt;: Simplify implementation with intuitive Python interfaces, avoiding complex YAML setups.&lt;/li&gt;
&lt;/ol&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Ollama vs. Ray Serve vs. Custom FastAPI: A Comparison&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;1. Ollama: The Lightweight Assistant for LLMs&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Ollama is designed to quickly set up local LLM services like LLaMA or other open-source models.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Strengths:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Plug-and-Play Simplicity&lt;/strong&gt;: Minimal configuration required.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;LLM-Focused&lt;/strong&gt;: Optimized for large language models with offline deployment support.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Weaknesses:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Limited Flexibility&lt;/strong&gt;: Restricted to LLMs and lacks support for multi-model management.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Scalability Concerns&lt;/strong&gt;: Not ideal for high-concurrency or distributed deployments.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;2. Custom FastAPI: The DIY Player for Enthusiasts&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;FastAPI is a flexible web framework for building lightweight APIs, including ones that interface with backend models.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Strengths:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Full Customization&lt;/strong&gt;: You have complete control over the logic and routing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Lightweight&lt;/strong&gt;: Ideal for small-scale projects.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Weaknesses:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Manual Scaling&lt;/strong&gt;: Requires hand-crafted solutions for scaling and multi-model management.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Complex Distributed Deployments&lt;/strong&gt;: Needs additional tools like Kubernetes for distributed setups.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;3. Ray Serve: The Smart Manager&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Ray Serve combines Ollama’s simplicity with FastAPI’s flexibility, adding powerful distributed capabilities.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Strengths:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Multi-Model Support&lt;/strong&gt;: Host multiple models simultaneously.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dynamic Scaling&lt;/strong&gt;: Automatically adjust resources based on traffic.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Distributed Deployment&lt;/strong&gt;: Handles multi-node clusters effortlessly.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Batching Optimization&lt;/strong&gt;: Combines multiple requests for efficient processing.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Weaknesses:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Learning Curve&lt;/strong&gt;: Configuration is more complex than FastAPI.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ray Dependency&lt;/strong&gt;: May feel like overkill for single-node setups.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Choosing the Right Tool&lt;/strong&gt;&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature/Framework&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Ray Serve&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Ollama&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;FastAPI&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-Model Support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;Weak&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Distributed Deployment&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dynamic Scaling&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Learning Curve&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best For&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Large-scale distributed projects, complex model serving&lt;/td&gt;
&lt;td&gt;Quick local LLM deployment&lt;/td&gt;
&lt;td&gt;Small projects, API customization&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;If you just want a quick, local LLM deployment, go with &lt;strong&gt;Ollama&lt;/strong&gt;. For flexible API development, &lt;strong&gt;FastAPI&lt;/strong&gt; is your best choice. If you need multi-model management, dynamic scaling, or distributed deployment, &lt;strong&gt;Ray Serve&lt;/strong&gt; is the ultimate solution.&lt;/p&gt;
&lt;p&gt;Ray Serve acts as the &quot;smart manager&quot; of backend services, effortlessly handling both single-node and multi-node deployments. Stay tuned for a deeper dive into how Ray Serve dynamically adjusts resources based on traffic!&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Groundbreaking News: OpenAI Unveils o3 and o3 Mini with Stunning ARC-AGI Performance</title><link>https://geekcoding101.com/posts/groundbreaking-news-openai-unveils-o3-and-o3-mini-with-stunning-arc-agi-performance</link><guid isPermaLink="true">https://geekcoding101.com/posts/groundbreaking-news-openai-unveils-o3-and-o3-mini-with-stunning-arc-agi-performance</guid><pubDate>Sat, 21 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;On December 20, 2024, &lt;a href=&quot;https://openai.com/12-days/&quot;&gt;OpenAI concluded its 12-day &quot;OpenAI Christmas Gifts&quot; campaign&lt;/a&gt; by revealing two groundbreaking models: &lt;strong&gt;o3 and o3 mini&lt;/strong&gt;. At the same time, &lt;a href=&quot;https://arcprize.org/&quot;&gt;the ARC Prize organization&lt;/a&gt; announced OpenAI&apos;s remarkable performance on the ARC-AGI benchmark. The o3 system scored a &lt;strong&gt;breakthrough 75.7% on the Semi-Private Evaluation Set&lt;/strong&gt;, with a staggering &lt;strong&gt;87.5% in high-compute mode&lt;/strong&gt; (using 172x compute resources). This achievement marks an unprecedented leap in AI&apos;s ability to adapt to novel tasks, setting a new milestone in generative AI development.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./12days.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;The o3 Series: From Innovation to Breakthrough&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;img src=&quot;./day12-live.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;OpenAI CEO Sam Altman had hinted that this release would feature “big updates” and some “stocking stuffers.” The o3 series clearly falls into the former category. Both &lt;strong&gt;o3&lt;/strong&gt; and &lt;strong&gt;o3 mini&lt;/strong&gt; represent a pioneering step towards 2025, showcasing exceptional reasoning capabilities and redefining the possibilities of AI systems.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;ARC-AGI Performance: A Milestone Achievement for o3&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;The o3 system demonstrated its capabilities on the ARC-AGI benchmark, achieving &lt;strong&gt;75.7% in efficient mode&lt;/strong&gt; and &lt;strong&gt;87.5% in high-compute mode&lt;/strong&gt;. These scores represent a major leap in AI&apos;s ability to generalize and adapt to novel tasks, far surpassing previous generative AI models.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./scores-arc-agi.jpg&quot; alt=&quot;oai-o3-arc-agi-score&quot; /&gt; From https://arcprize.org/blog/oai-o3-pub-breakthrough&lt;/p&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;What is ARC-AGI?&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;&lt;strong&gt;ARC-AGI&lt;/strong&gt; (Abstraction and Reasoning Corpus for Artificial General Intelligence) is a benchmark specifically designed to test AI&apos;s adaptability and generalization. Its tasks are uniquely crafted:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Simple for humans&lt;/strong&gt;: Tasks like logical reasoning and problem-solving.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Challenging for AI&lt;/strong&gt;: Especially when models haven’t been explicitly trained on similar data.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;o3’s performance highlights a significant improvement in tackling new tasks, with its high-compute configuration setting a new standard at 87.5%.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;How o3 Outshines Traditional LLMs: From Memory to Program Synthesis&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Traditional GPT models rely on &quot;memorization&quot;: learning and executing predefined programs based on massive training data. However, this approach struggles with novel tasks due to its inability to dynamically recombine knowledge or generate new &quot;programs.&quot;&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;o3&apos;s Core Innovation: Dynamic Knowledge Recombination&lt;/strong&gt;&lt;/h4&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Program Search and Execution&lt;/strong&gt; o3 generates natural language &quot;programs&quot; (such as &lt;strong&gt;Chains of Thought, CoT&lt;/strong&gt;) to solve tasks and executes them internally.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Evaluation and Refinement&lt;/strong&gt; Using techniques similar to &lt;strong&gt;Monte-Carlo Tree Search (MCTS)&lt;/strong&gt;, o3 dynamically evaluates program paths and selects optimal solutions.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;While this process is compute-intensive (requiring millions of tokens and significant costs per task), it dramatically enhances AI’s adaptability to new challenges.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Efficiency vs. Cost: Balancing o3’s Performance&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Despite its remarkable performance, o3’s high-compute mode comes with significant costs. According to ARC Prize data:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Efficient mode&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;Cost per task: ~$20&lt;/li&gt;
&lt;li&gt;Semi-Private Eval score: 75.7%&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;High-compute mode&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;Uses 172x the resources of the efficient mode.&lt;/li&gt;
&lt;li&gt;Achieves 87.5%, but with a much higher cost.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
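&lt;p&gt;As a back-of-the-envelope estimate, and assuming cost scales roughly linearly with compute (an assumption on my part, not an ARC Prize figure), the high-compute mode would run about 172 times the ~$20 efficient-mode price per task:&lt;/p&gt;

```python
# Rough cost estimate for o3's high-compute mode, assuming cost scales
# linearly with compute. ARC Prize reported ~$20 per task in efficient
# mode and a 172x compute multiplier for high-compute mode.
efficient_cost_per_task = 20      # USD, approximate
compute_multiplier = 172

high_compute_cost_per_task = efficient_cost_per_task * compute_multiplier
print(high_compute_cost_per_task)  # 3440
```

That is, on the order of thousands of dollars per task, which is why the cost-performance ratio is flagged as a challenge above.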
&lt;p&gt;While the current cost-performance ratio remains a challenge, advancements in optimization and hardware are expected to reduce costs in the coming months.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;What Makes o3 a Groundbreaking Leap?&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;1. Task Adaptability&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;o3 dynamically generates and executes task-specific natural language programs, moving beyond the static “memorization” paradigm of previous generative AI models.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;2. Generalization&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Compared to the GPT series, o3 demonstrates near-human generalization capabilities, especially on benchmarks like ARC-AGI.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;3. Architectural Innovation&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;o3’s success underscores the critical role of architecture in advancing AI capabilities. Simply scaling GPT-4 or similar models would not achieve comparable results.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Is o3 AGI?&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;While o3’s performance is extraordinary, it has not yet reached the level of Artificial General Intelligence (AGI). Key limitations include:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Failures on Simple Tasks&lt;/strong&gt; Even in high-compute mode, o3 struggles with some straightforward tasks, revealing gaps in fundamental reasoning.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Challenges with ARC-AGI-2&lt;/strong&gt; Preliminary tests suggest that o3 might score below 30% on the upcoming ARC-AGI-2 benchmark, while average humans score over 95%.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;These challenges highlight that while o3 is a significant milestone, it remains a step on the path to true AGI.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Looking Ahead: The Future of o3 and AGI&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;1. Open-Source Collaboration&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;The ARC Prize initiative plans to launch the more challenging ARC-AGI-2 benchmark in 2025, encouraging researchers to build on o3’s success through open-source analysis and optimization.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;2. Expanding Capabilities&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Further analysis of o3 will help identify its mechanisms, performance bottlenecks, and potential for future advancements.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;3. Advancing Benchmarks&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;The ARC Prize Foundation is developing third-generation benchmarks to push the boundaries of AI systems’ adaptability and generalization.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Conclusion: The Significance of o3&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;OpenAI’s o3 model represents a groundbreaking leap in generative AI, pushing the boundaries of task adaptability and dynamic knowledge recombination. By overcoming the limitations of traditional LLMs, o3 opens new avenues for addressing novel challenges.&lt;/p&gt;
&lt;p&gt;This is only the beginning. With new benchmarks and collaborative research on the horizon, o3 sets the stage for further progress towards AGI. As we look ahead to 2025, the future of AI promises even greater possibilities.&lt;/p&gt;
&lt;p&gt;:::info&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; The content above includes contributions generated with the assistance of AI tools.&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Diving into &quot;Attention is All You Need&quot;: My Transformer Journey Begins!</title><link>https://geekcoding101.com/posts/diving-into-attention-is-all-you-need-my-transformer-journey-begins</link><guid isPermaLink="true">https://geekcoding101.com/posts/diving-into-attention-is-all-you-need-my-transformer-journey-begins</guid><pubDate>Sat, 28 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Today marks the beginning of my adventure into one of the most groundbreaking AI papers on the Transformer: &lt;strong&gt;&quot;Attention is All You Need&quot;&lt;/strong&gt; by Vaswani et al. If you’ve ever been curious about how modern language models like GPT or BERT work, this is where it all started. It’s like diving into the DNA of &lt;strong&gt;transformers&lt;/strong&gt; — the core architecture behind many AI marvels today.&lt;/p&gt;
&lt;p&gt;What I’ve learned so far has completely blown my mind, so let’s break it down step by step. I’ll keep it fun, insightful, and bite-sized so you can learn alongside me! From today, I plan to study one or two pages of this paper daily and share my learning highlights right here.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;Day 1: The Abstract&lt;/h3&gt;
&lt;p&gt;The abstract of &lt;strong&gt;&quot;Attention is All You Need&quot;&lt;/strong&gt; sets the stage for the paper’s groundbreaking contributions. Here’s what I’ve uncovered today about the &lt;strong&gt;Transformer architecture&lt;/strong&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The Problem with Traditional Models:&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Most traditional sequence models rely on &lt;strong&gt;Recurrent Neural Networks (RNNs)&lt;/strong&gt; or &lt;strong&gt;Convolutional Neural Networks (CNNs)&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;These models have limitations:
&lt;ul&gt;
&lt;li&gt;RNNs are slow due to sequential processing and lack parallelization.&lt;/li&gt;
&lt;li&gt;CNNs struggle to capture long-range dependencies effectively.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Transformer’s Proposal:&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;The paper introduces the &lt;strong&gt;Transformer&lt;/strong&gt;, a new architecture that uses only &lt;strong&gt;Attention Mechanisms&lt;/strong&gt; while completely removing recurrence and convolution. This approach makes &lt;strong&gt;transformers&lt;/strong&gt; faster and more efficient.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Experimental Results:&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;On &lt;strong&gt;WMT 2014 English-German translation&lt;/strong&gt;, the Transformer achieves a BLEU score of 28.4, surpassing previous models by over 2 BLEU points. WMT (Workshop on Machine Translation) is a benchmark competition for translation models, and this task involves translating English text into German.&lt;/li&gt;
&lt;li&gt;On &lt;strong&gt;WMT 2014 English-French translation&lt;/strong&gt;, it achieves a state-of-the-art BLEU score of 41.8 with significantly lower training costs. This task involves translating English text into French.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What is BLEU?&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;BLEU (Bilingual Evaluation Understudy) is a metric used to evaluate the quality of machine translations. It measures how closely the machine-generated translation matches human reference translations. Scores range from 0 to 100, with higher scores indicating better performance.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Generalization to Other Tasks:&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;Transformer model&lt;/strong&gt; is not just limited to translation. The paper demonstrates its effectiveness in &lt;strong&gt;English constituency parsing&lt;/strong&gt;, even with limited training data.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
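&lt;p&gt;To see what &quot;matching human reference translations&quot; means in practice, here is a toy sketch of BLEU&apos;s modified unigram precision in plain Python. Real BLEU also combines 2- to 4-gram precisions and applies a brevity penalty, so treat this as a deliberately simplified illustration:&lt;/p&gt;

```python
from collections import Counter

def unigram_precision(candidate, reference):
    """Toy version of BLEU's modified unigram precision.

    Counts how many candidate words also appear in the reference,
    clipping each word's count to its count in the reference. Real BLEU
    extends this to 2- to 4-grams and adds a brevity penalty.
    """
    cand_counts = Counter(candidate.split())
    ref_counts = Counter(reference.split())
    overlap = sum(min(count, ref_counts[word])
                  for word, count in cand_counts.items())
    return overlap / sum(cand_counts.values())

score = unigram_precision("the cat sat on the mat",
                          "the cat is on the mat")
print(round(score, 3))  # 0.833 -- 5 of 6 candidate words match
```

A perfect match scores 1.0; an unrelated sentence scores near 0, which mirrors how the 0-to-100 BLEU scale behaves.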
&lt;hr /&gt;
&lt;h3&gt;Why Transformers Matter&lt;/h3&gt;
&lt;p&gt;Transformers are everywhere now. From powering tools like &lt;strong&gt;Google Translate&lt;/strong&gt; to enabling cutting-edge models like &lt;strong&gt;GPT&lt;/strong&gt;, the ideas in this paper are the foundation of modern AI. Learning about &lt;strong&gt;transformers&lt;/strong&gt; feels like discovering the blueprint of an advanced technology that’s reshaping the world.&lt;/p&gt;
&lt;p&gt;What’s next for me? Tomorrow, I’ll dive into the introduction and explore why &lt;strong&gt;attention mechanisms&lt;/strong&gt; are such a powerful concept within the Transformer architecture.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;Your Takeaway&lt;/h3&gt;
&lt;p&gt;If you’ve been putting off reading this paper, join me! It’s surprisingly approachable once you break it down into smaller concepts. Stay tuned for more updates on my journey, and let’s explore the world of &lt;strong&gt;transformers&lt;/strong&gt; together. Spoiler: it’s insanely cool!&lt;/p&gt;
&lt;p&gt;:::info&lt;/p&gt;
&lt;p&gt;I was struggling with when to use &quot;Transformers&quot; or &quot;Transformer&quot;; here is the explanation from ChatGPT:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Singular &lt;strong&gt;Transformer&lt;/strong&gt; is used correctly when talking about the architecture itself.&lt;/li&gt;
&lt;li&gt;Plural &lt;strong&gt;Transformers&lt;/strong&gt; is used correctly when discussing broader applications.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;Stay curious, stay excited. Let the learning adventure begin! 🚀&lt;/p&gt;
&lt;p&gt;:::info&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; The content above includes contributions generated with the assistance of AI tools.&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Terms Used in &quot;Attention is All You Need&quot;</title><link>https://geekcoding101.com/posts/terms-used-in-attention-is-all-you-need</link><guid isPermaLink="true">https://geekcoding101.com/posts/terms-used-in-attention-is-all-you-need</guid><pubDate>Sat, 28 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;./attention-is-all-you-need-term.png&quot; alt=&quot;attention is all you need term&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Below is a comprehensive table of key terms used in the paper &quot;Attention is All You Need,&quot; along with their English and Chinese translations. Where applicable, links to external resources are provided for further reading.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;English Term&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Chinese Translation&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Explanation&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Link&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Encoder&lt;/td&gt;
&lt;td&gt;编码器&lt;/td&gt;
&lt;td&gt;The component that processes input sequences.&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Decoder&lt;/td&gt;
&lt;td&gt;解码器&lt;/td&gt;
&lt;td&gt;The component that generates output sequences.&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Attention Mechanism&lt;/td&gt;
&lt;td&gt;注意力机制&lt;/td&gt;
&lt;td&gt;Measures relationships between sequence elements.&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Attention_mechanism&quot;&gt;Attention Mechanism Explained&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Self-Attention&lt;/td&gt;
&lt;td&gt;自注意力&lt;/td&gt;
&lt;td&gt;Focuses on dependencies within a single sequence.&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Masked Self-Attention&lt;/td&gt;
&lt;td&gt;掩码自注意力&lt;/td&gt;
&lt;td&gt;Prevents the decoder from seeing future tokens.&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-Head Attention&lt;/td&gt;
&lt;td&gt;多头注意力&lt;/td&gt;
&lt;td&gt;Combines multiple attention layers for better modeling.&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Positional Encoding&lt;/td&gt;
&lt;td&gt;位置编码&lt;/td&gt;
&lt;td&gt;Adds positional information to embeddings.&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Residual Connection&lt;/td&gt;
&lt;td&gt;残差连接&lt;/td&gt;
&lt;td&gt;Shortcut connections to improve gradient flow.&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Layer Normalization&lt;/td&gt;
&lt;td&gt;层归一化&lt;/td&gt;
&lt;td&gt;Stabilizes training by normalizing inputs.&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Layer_normalization&quot;&gt;Layer Normalization Details&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Feed-Forward Neural Network (FFNN)&lt;/td&gt;
&lt;td&gt;前馈神经网络&lt;/td&gt;
&lt;td&gt;Processes data independently of sequence order.&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://web.stanford.edu/class/cs224n/&quot;&gt;Feed-Forward Networks in NLP&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Recurrent Neural Network (RNN)&lt;/td&gt;
&lt;td&gt;循环神经网络&lt;/td&gt;
&lt;td&gt;Processes sequences step-by-step, maintaining state.&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Recurrent_neural_network&quot;&gt;RNN Basics&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Convolutional Neural Network (CNN)&lt;/td&gt;
&lt;td&gt;卷积神经网络&lt;/td&gt;
&lt;td&gt;Uses convolutions to extract features from input data.&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Convolutional_neural_network&quot;&gt;CNN Overview&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Parallelization&lt;/td&gt;
&lt;td&gt;并行化&lt;/td&gt;
&lt;td&gt;Performing multiple computations simultaneously.&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BLEU (Bilingual Evaluation Understudy)&lt;/td&gt;
&lt;td&gt;双语评估替代&lt;/td&gt;
&lt;td&gt;A metric for evaluating the accuracy of translations.&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/BLEU&quot;&gt;Understanding BLEU&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;This table provides a solid foundation for understanding the technical terms used in the &quot;Attention is All You Need&quot; paper. If you have questions or want to dive deeper into any term, the linked resources are a great place to start!&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Transformers Demystified - Day 2 - Unlocking the Genius of Self-Attention and AI&apos;s Greatest Breakthrough</title><link>https://geekcoding101.com/posts/transformers-demystified-day-2-unlocking-the-genius-of-self-attention-and-ais-greatest-breakthrough</link><guid isPermaLink="true">https://geekcoding101.com/posts/transformers-demystified-day-2-unlocking-the-genius-of-self-attention-and-ais-greatest-breakthrough</guid><pubDate>Mon, 30 Dec 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Transformers are changing the AI landscape, and it all began with the groundbreaking paper &quot;Attention is All You Need.&quot; Today, I explore the &lt;strong&gt;Introduction&lt;/strong&gt; and &lt;strong&gt;Background&lt;/strong&gt; sections of the paper, uncovering the limitations of traditional RNNs, the power of self-attention, and the importance of parallelization in modern AI models. Dive in to learn how Transformers revolutionized sequence modeling and transduction tasks!&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;:::info&lt;/p&gt;
&lt;p&gt;I’ve embarked on an exciting journey to thoroughly understand the groundbreaking paper &lt;strong&gt;“Attention is All You Need.”&lt;/strong&gt; My approach is simple but thorough: each day, I focus on a specific section of the paper, breaking it down line by line to grasp every concept, idea, and nuance. Along the way, I simplify technical terms, explore references, and explain math concepts in an accessible manner. I also supplement my learning with further readings and analogies to make even the most complex topics easy to understand. This step-by-step method ensures that I not only learn but truly internalize the foundations of &lt;strong&gt;Transformers&lt;/strong&gt;, setting the stage for more advanced explorations. If you’re curious about Transformers or modern AI, join me as I unravel this revolutionary model one day at a time!&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;h1&gt;&lt;strong&gt;1. Introduction&lt;/strong&gt;&lt;/h1&gt;
&lt;h2&gt;&lt;strong&gt;Sentence 1:&lt;/strong&gt;&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;Recurrent neural networks, long short-term memory [13] and gated recurrent [7] neural networks in particular, have been firmly established as state-of-the-art approaches in sequence modeling and transduction problems such as language modeling and machine translation [35, 2, 5].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation (like for an elementary school student):&lt;/strong&gt; There are special types of AI models called &lt;strong&gt;Recurrent Neural Networks (RNNs)&lt;/strong&gt; that are like people who can remember things from the past while working on something new.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Long Short-Term Memory (LSTM)&lt;/strong&gt; and &lt;strong&gt;Gated Recurrent Units (GRUs)&lt;/strong&gt; are improved versions of RNNs.&lt;/li&gt;
&lt;li&gt;These models are the &lt;strong&gt;best performers&lt;/strong&gt; (state-of-the-art) for tasks where you need to process sequences, like predicting the next word in a sentence (&lt;strong&gt;language modeling&lt;/strong&gt;) or translating text from one language to another (&lt;strong&gt;machine translation&lt;/strong&gt;).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Key terms explained:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Recurrent Neural Networks (RNNs):&lt;/strong&gt; Models designed to handle sequential data (like sentences, time series).
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Analogy:&lt;/em&gt; Imagine reading a book where each sentence depends on the one before it. An RNN processes the book one sentence at a time, remembering earlier ones.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Further Reading:&lt;/strong&gt; &lt;a href=&quot;https://en.wikipedia.org/wiki/Recurrent_neural_network&quot;&gt;RNNs on Wikipedia&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Long Short-Term Memory (LSTM):&lt;/strong&gt; A type of RNN that solves the problem of forgetting important past information.
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Analogy:&lt;/em&gt; LSTMs are like a memory-keeper that knows what’s important to remember and what to forget.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Further Reading:&lt;/strong&gt; &lt;a href=&quot;https://en.wikipedia.org/wiki/Long_short-term_memory&quot;&gt;LSTM on Wikipedia&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Gated Recurrent Units (GRUs):&lt;/strong&gt; A simpler version of LSTM, with fewer memory-related functions.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Further Reading:&lt;/strong&gt; &lt;a href=&quot;https://en.wikipedia.org/wiki/Gated_recurrent_unit&quot;&gt;GRU Details&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sequence Modeling and Transduction:&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Sequence Modeling:&lt;/strong&gt; Tasks like predicting the next word in a sentence.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sequence Transduction:&lt;/strong&gt; Tasks like translating sentences into another language or converting text to speech.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Further Reading:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1511.06114&quot;&gt;Sequence Transduction Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;References explained:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;[13]&lt;/strong&gt; Hochreiter &amp;amp; Schmidhuber (1997): Introduced LSTMs.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; LSTM Original Paper&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;[7]&lt;/strong&gt; Chung et al. (2014): Evaluated GRUs.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1412.3555&quot;&gt;GRU Evaluation Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;[35, 2, 5]:&lt;/strong&gt; Machine translation and language modeling using RNNs.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 2:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;Numerous efforts have since continued to push the boundaries of recurrent language models and encoder-decoder architectures [38, 24, 15].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; Over time, researchers have been working hard to make RNNs even better. They focused on:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Recurrent language models:&lt;/strong&gt; Making RNNs predict words more accurately.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Encoder-Decoder architectures:&lt;/strong&gt; A setup where one model (encoder) processes the input, and another model (decoder) generates the output (like translation).&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Key terms explained:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Encoder-Decoder Architecture:&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;The encoder compresses the input into a smaller representation (like summarizing).&lt;/li&gt;
&lt;li&gt;The decoder uses this compressed information to generate the output.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Analogy:&lt;/em&gt; Like translating English to French — first understanding the English text, then generating the French version.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Further Reading:&lt;/strong&gt; Encoder-Decoder Overview&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;References explained:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;[38] Wu et al. (2016):&lt;/strong&gt; Explored Google’s Neural Machine Translation (GNMT) using encoder-decoder architectures.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1609.08144&quot;&gt;GNMT Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;[24] Luong et al. (2015):&lt;/strong&gt; Studied effective approaches to attention in neural machine translation.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1508.04025&quot;&gt;Luong Attention Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;[15] Jozefowicz et al. (2016):&lt;/strong&gt; Studied language modeling limits.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1602.02410&quot;&gt;Language Model Study&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 3:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;Recurrent models typically factor computation along the symbol positions of the input and output sequences.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; RNNs handle input/output one step at a time:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Input symbols:&lt;/em&gt; Letters, words, or parts of words in a sentence.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Factor computation:&lt;/em&gt; RNNs calculate each part of the sequence (e.g., one word) in a fixed order.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 4:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;Aligning the positions to steps in computation time, they generate a sequence of hidden states $h_t$, as a function of the previous hidden state $h_{t-1}$ and the input for position $t$.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; RNNs have a hidden memory state ($h_t$) that stores what they have learned so far:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;For each position ($t$):
&lt;ul&gt;
&lt;li&gt;Use the previous memory ($h_{t-1}$).&lt;/li&gt;
&lt;li&gt;Add the new input information for position $t$.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Math Representation:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;$$h_t = f(h_{t-1}, x_t)$$&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$h_t$: Hidden state at time $t$.&lt;/li&gt;
&lt;li&gt;$h_{t-1}$: Previous hidden state.&lt;/li&gt;
&lt;li&gt;$x_t$: Input at time $t$.&lt;/li&gt;
&lt;li&gt;$f$: Function combining these.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;em&gt;Analogy:&lt;/em&gt; Think of $h_t$ as a diary where you write today’s experiences based on yesterday’s memories.&lt;/p&gt;
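&lt;p&gt;The recurrence above can be sketched in a few lines of Python. Here $f$ is a toy stand-in (a simple sum) for the learned transformation in a real RNN; the point is that each step needs the previous one&apos;s result, which is exactly the sequential bottleneck discussed next:&lt;/p&gt;

```python
def rnn_step(h_prev, x_t):
    """One recurrence step h_t = f(h_{t-1}, x_t).

    A toy f (plain addition) stands in for the learned transformation
    of a real RNN cell.
    """
    return h_prev + x_t

def run_rnn(inputs, h0=0):
    """Process a sequence one position at a time.

    Each iteration depends on the hidden state from the previous one,
    so this loop cannot be parallelized across positions.
    """
    h = h0
    states = []
    for x_t in inputs:
        h = rnn_step(h, x_t)
        states.append(h)
    return states

print(run_rnn([1, 2, 3]))  # [1, 3, 6] -- each state builds on the last
```

Swapping in a real cell (LSTM, GRU) changes `rnn_step` but not the shape of the loop, which is why the paper's sequential-computation critique applies to all of them.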
&lt;h4&gt;&lt;strong&gt;Sentence 5:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;This inherently sequential nature precludes parallelization within training examples, which becomes critical at longer sequence lengths, as memory constraints limit batching across examples.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Since RNNs process sequences step-by-step (sequentially), they can’t do multiple steps at the same time (no parallelization).&lt;/li&gt;
&lt;li&gt;This is a problem for long sequences because:
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Memory limits&lt;/strong&gt;: You can’t train many sequences together (batching is limited).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Time cost&lt;/strong&gt;: Processing each step one at a time is slow.&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;em&gt;Analogy:&lt;/em&gt; Imagine reading a book one sentence at a time vs. scanning multiple pages in parallel. RNNs are like the first method: slow and memory-hungry for large books.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Why this is a problem:&lt;/strong&gt; In real-world tasks like translation, sentences can be very long, making RNNs less efficient.&lt;/p&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 6:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;Recent work has achieved significant improvements in computational efficiency through factorization tricks [21] and conditional computation [32], while also improving model performance in case of the latter.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; Some researchers found clever ways to make RNNs faster and better:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Factorization tricks:&lt;/strong&gt; These simplify calculations to save time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Conditional computation:&lt;/strong&gt; This focuses on only the important parts of the sequence, skipping unnecessary work.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;References explained:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;[21] Factorization Tricks:&lt;/strong&gt; Simplifies computations in LSTMs for faster training.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1703.10722&quot;&gt;Factorization Tricks Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;[32] Conditional Computation:&lt;/strong&gt; Introduced sparsely gated mixture-of-experts layers, improving efficiency.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1701.06538&quot;&gt;Conditional Computation Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 7:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;The fundamental constraint of sequential computation, however, remains.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; Even with improvements, RNNs still can’t avoid processing sequences step-by-step. This sequential nature is their biggest limitation.&lt;/p&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 8:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;Attention mechanisms have become an integral part of compelling sequence modeling and transduction models in various tasks, allowing modeling of dependencies without regard to their distance in the input or output sequences [2, 19].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Attention mechanisms&lt;/strong&gt; are like a smart highlight tool that helps models focus on the most important parts of the input.&lt;/li&gt;
&lt;li&gt;The big advantage? Attention doesn’t care how far apart the related elements are in a sequence (e.g., the first and last words of a long sentence).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;References explained:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;[2] Bahdanau et al. (2014):&lt;/strong&gt; Introduced attention in neural machine translation.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1409.0473&quot;&gt;Bahdanau Attention Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;[19] Kim et al. (2017):&lt;/strong&gt; Explored structured attention networks.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1702.00887&quot;&gt;Structured Attention Networks Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 9:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;In all but a few cases [27], however, such attention mechanisms are used in conjunction with a recurrent network.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; Most models use attention &lt;strong&gt;with&lt;/strong&gt; RNNs (as an extra feature) instead of replacing the RNN completely.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Reference explained:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;[27] Parikh et al. (2016):&lt;/strong&gt; Proposed a decomposable attention model without recurrence.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1606.01933&quot;&gt;Decomposable Attention Model Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 10:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;In this work we propose the Transformer, a model architecture eschewing recurrence and instead relying entirely on an attention mechanism to draw global dependencies between input and output.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; The &lt;strong&gt;Transformer&lt;/strong&gt; is a new model that:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Removes recurrence:&lt;/strong&gt; No RNNs are used at all.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Uses only attention:&lt;/strong&gt; Attention mechanisms handle all the work of relating input and output sequences.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Why it’s exciting:&lt;/strong&gt; This design solves the problems of RNNs (sequential processing and memory issues) while keeping the ability to model relationships in long sequences.&lt;/p&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 11:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;The Transformer allows for significantly more parallelization and can reach a new state of the art in translation quality after being trained for as little as twelve hours on eight P100 GPUs.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The Transformer is &lt;strong&gt;fast&lt;/strong&gt; because it processes sequences in parallel.&lt;/li&gt;
&lt;li&gt;In experiments, it achieved top performance in translation tasks with just 12 hours of training on 8 GPUs (powerful processors).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Key takeaway:&lt;/strong&gt; The Transformer is faster, more efficient, and achieves better results than traditional models.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;2. Background&lt;/strong&gt;&lt;/h3&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 1:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;The goal of reducing sequential computation also forms the foundation of the Extended Neural GPU [16], ByteNet [18] and ConvS2S [9], all of which use convolutional neural networks as basic building block, computing hidden representations in parallel for all input and output positions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; Some models before the Transformer also tried to solve the problem of sequential processing:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Extended Neural GPU:&lt;/strong&gt; Uses convolutions for parallel sequence computation.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ByteNet:&lt;/strong&gt; Uses convolutions to process sequences in parallel.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ConvS2S:&lt;/strong&gt; Combines convolutions with sequence modeling.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Why they matter:&lt;/strong&gt; These models inspired the Transformer by showing that parallelization could work.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;References explained:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;[16] Extended Neural GPU:&lt;/strong&gt; Explored memory-efficient computations.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1607.00036&quot;&gt;Extended Neural GPU Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;[18] ByteNet:&lt;/strong&gt; Introduced logarithmic efficiency for sequence processing.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1610.10099&quot;&gt;ByteNet Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;[9] ConvS2S:&lt;/strong&gt; Used convolutions for sequence-to-sequence learning.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1705.03122&quot;&gt;ConvS2S Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 2:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;In these models, the number of operations required to relate signals from two arbitrary input or output positions grows in the distance between positions, linearly for ConvS2S and logarithmically for ByteNet.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;For models like ByteNet and ConvS2S, the farther apart two elements in a sequence are, the more operations are needed to relate them.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ConvS2S:&lt;/strong&gt; Operations increase &lt;strong&gt;linearly&lt;/strong&gt; (slow for long sequences).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ByteNet:&lt;/strong&gt; Operations increase &lt;strong&gt;logarithmically&lt;/strong&gt; (faster but still depends on distance).&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 3:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;This makes it more difficult to learn dependencies between distant positions [12].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;In models like ConvS2S and ByteNet, the more operations needed to relate distant parts of a sequence, the harder it is for the model to learn meaningful relationships between those parts.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why it matters:&lt;/strong&gt; For tasks like translation, where the first and last words of a sentence may be closely connected, this limitation is a big problem.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Reference explained:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;[12] Hochreiter et al. (2001):&lt;/strong&gt; This paper explains the challenges of learning long-term dependencies in sequences due to gradient-related issues in recurrent models.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; Gradient Flow Paper&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 4:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;In the Transformer this is reduced to a constant number of operations, albeit at the cost of reduced effective resolution due to averaging attention-weighted positions, an effect we counteract with Multi-Head Attention as described in section 3.2.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The Transformer solves the dependency problem by requiring only a &lt;strong&gt;constant number of operations&lt;/strong&gt; to relate any two positions in a sequence.
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Analogy:&lt;/em&gt; Think of a Transformer as a direct highway between every pair of cities, instead of needing to stop at every town along the way like in RNNs or ConvS2S.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Averaging Attention-Weighted Positions:&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Attention assigns a “weight” to each position in the sequence to decide how important it is.&lt;/li&gt;
&lt;li&gt;Averaging these weights reduces the ability to capture fine-grained details, like losing sharpness in a photo.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Multi-Head Attention:&lt;/strong&gt; The Transformer fixes this by using multiple attention mechanisms (heads), which we’ll cover in section 3.2.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;&lt;strong&gt;Math Explanation for Operations&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Let’s break down the &lt;strong&gt;constant vs. linear vs. logarithmic growth&lt;/strong&gt; using simple terms and math.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;ConvS2S (Linear Growth):&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;To relate two distant elements, ConvS2S needs $O(d)$ operations, where $d$ is the distance between them.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Example:&lt;/em&gt; If $d=10$, ConvS2S needs 10 operations. If $d=100$, it needs 100 operations.&lt;/li&gt;
&lt;li&gt;Linear growth means: The cost increases directly with the distance.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ByteNet (Logarithmic Growth):&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;ByteNet improves this with $O(\log(d))$ operations.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Example:&lt;/em&gt; If $d=10$, it might need about 3 operations (since $\log_2(10) \approx 3$).&lt;/li&gt;
&lt;li&gt;Logarithmic growth means: The cost increases slowly as the distance grows.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Transformer (Constant Growth):&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;The Transformer needs only $O(1)$ operations, regardless of distance $d$.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Example:&lt;/em&gt; Whether $d=10$ or $d=1000$, the cost stays the same.&lt;/li&gt;
&lt;li&gt;This is because attention mechanisms compare all positions simultaneously.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Why it matters:&lt;/strong&gt; Constant-time operations make the Transformer much faster and scalable for long sequences.&lt;/p&gt;
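&lt;p&gt;&lt;em&gt;Sketch:&lt;/em&gt; a toy cost model (my own illustration, not from the paper) makes the three growth rates concrete:&lt;/p&gt;

```python
import math

# Toy cost models: operations needed to relate two positions d apart
def ops_convs2s(d):
    return d  # linear: O(d)

def ops_bytenet(d):
    return max(1, round(math.log2(d)))  # logarithmic: O(log d)

def ops_transformer(d):
    return 1  # constant: O(1), attention compares all positions at once

for d in (10, 100, 1000):
    print(d, ops_convs2s(d), ops_bytenet(d), ops_transformer(d))
```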
&lt;h4&gt;&lt;strong&gt;Sentence 5:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;Self-attention, sometimes called intra-attention, is an attention mechanism relating different positions of a single sequence in order to compute a representation of the sequence.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Self-Attention:&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;A mechanism where a model focuses on relationships within the same sequence (e.g., relating the subject of a sentence to its verb).&lt;/li&gt;
&lt;li&gt;It’s like looking at a single document and marking connections between sentences to summarize its meaning.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Representation of the Sequence:&lt;/strong&gt; The output of self-attention is a compact representation that captures all the important information about the sequence.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Further Reading:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://towardsdatascience.com/illustrated-self-attention-2d627e33b20a&quot;&gt;Understanding Self-Attention&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
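&lt;p&gt;&lt;em&gt;Sketch:&lt;/em&gt; a minimal numerical version of self-attention in numpy. This is my own simplification: it omits the learned query/key/value projections and the multi-head structure the paper defines in section 3.2:&lt;/p&gt;

```python
import numpy as np

def self_attention(X):
    # Each row of X is one position's embedding.
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)  # pairwise similarity between positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ X  # each output row is a weighted mix of ALL positions

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # 3 positions, 2 dims
out = self_attention(X)
```

&lt;p&gt;Note how relating position 0 to position 2 costs the same matrix multiply as relating adjacent positions: distance simply does not appear.&lt;/p&gt;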
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 6:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;Self-attention has been used successfully in a variety of tasks including reading comprehension, abstractive summarization, textual entailment and learning task-independent sentence representations [4, 27, 28, 22].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; Self-attention is powerful and versatile. It has been used in:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Reading comprehension:&lt;/strong&gt; Understanding and answering questions about a passage.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Abstractive summarization:&lt;/strong&gt; Summarizing content by rewriting it in new words.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Textual entailment:&lt;/strong&gt; Determining if one sentence logically follows another.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Task-independent sentence representations:&lt;/strong&gt; Creating general-purpose sentence embeddings for use in different tasks.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;References explained:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;[4] Cheng et al. (2016):&lt;/strong&gt; Used LSTMs for machine reading.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1601.06733&quot;&gt;Machine Reading Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;[27] Parikh et al. (2016):&lt;/strong&gt; Proposed attention-based models without recurrence.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1606.01933&quot;&gt;Decomposable Attention Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;[28] Paulus et al. (2017):&lt;/strong&gt; Applied reinforcement learning for summarization.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1705.04304&quot;&gt;Summarization Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;[22] Lin et al. (2017):&lt;/strong&gt; Explored structured self-attentive embeddings.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1703.03130&quot;&gt;Self-Attentive Embeddings Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 7:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;End-to-end memory networks are based on a recurrent attention mechanism instead of sequence-aligned recurrence and have been shown to perform well on simple-language question answering and language modeling tasks [34].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;End-to-End Memory Networks:&lt;/strong&gt; A model that combines attention and memory for tasks like answering questions.&lt;/li&gt;
&lt;li&gt;Instead of processing sequences step-by-step like RNNs, these models use attention mechanisms to focus on relevant information in memory.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Use Cases:&lt;/strong&gt; Simple question answering and language modeling (predicting sentences).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Reference explained:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;[34] Sukhbaatar et al. (2015):&lt;/strong&gt; Proposed memory networks for reasoning tasks.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Link:&lt;/strong&gt; &lt;a href=&quot;https://arxiv.org/abs/1503.08895&quot;&gt;Memory Networks Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 8:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;To the best of our knowledge, however, the Transformer is the first transduction model relying entirely on self-attention to compute representations of its input and output without using sequence-aligned RNNs or convolution.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; The Transformer is unique because:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;It’s the &lt;strong&gt;first model&lt;/strong&gt; to rely &lt;strong&gt;completely&lt;/strong&gt; on self-attention.&lt;/li&gt;
&lt;li&gt;It doesn’t use RNNs or convolution at all, unlike earlier models.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Key takeaway:&lt;/strong&gt; This makes the Transformer faster, simpler, and more scalable than its predecessors.&lt;/p&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Sentence 9:&lt;/strong&gt;&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;In the following sections, we will describe the Transformer, motivate self-attention and discuss its advantages over models such as [17, 18] and [9].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt; The next parts of the paper will cover:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;How the Transformer works&lt;/strong&gt; (architecture).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Why self-attention is important&lt;/strong&gt; (motivation).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Comparison with older models&lt;/strong&gt; (e.g., Neural GPU, ByteNet, ConvS2S).&lt;/li&gt;
&lt;/ol&gt;
&lt;hr /&gt;
&lt;p&gt;Today, we explored the &lt;strong&gt;Introduction&lt;/strong&gt; and &lt;strong&gt;Background&lt;/strong&gt; sections of the revolutionary paper &lt;strong&gt;“Attention is All You Need.”&lt;/strong&gt; From understanding the limitations of RNNs to discovering the power of self-attention and parallelization, it’s clear why Transformers are a game-changer in the world of AI. These foundational insights set the stage for the next step in our journey: diving into the &lt;strong&gt;Transformer Architecture&lt;/strong&gt; itself. Tomorrow, I’ll delve into the mechanics of self-attention, multi-head attention, and positional encoding.&lt;/p&gt;
&lt;p&gt;Stay tuned as we continue to uncover the brilliance behind this landmark model!&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Ultimate Kubernetes Tutorial Part 1: Setting Up a Thriving Multi-Node Cluster on Mac</title><link>https://geekcoding101.com/posts/kubernetes-tutorial-part1</link><guid isPermaLink="true">https://geekcoding101.com/posts/kubernetes-tutorial-part1</guid><pubDate>Sat, 01 Mar 2025 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;/h1&gt;
&lt;p&gt;Hey there! Welcome to this Kubernetes tutorial! Ever dreamed of running a real multi-node &lt;a href=&quot;https://kubernetes.io/&quot;&gt;Kubernetes (K8s) cluster&lt;/a&gt; on your laptop instead of settling for Minikube’s diet version? A proper multi-node Kubernetes environment requires virtual machines, and until last year, VMware Fusion was a paid product—an obstacle for many. I know there are alternatives, like &lt;a href=&quot;https://www.linux-kvm.org/page/Downloads&quot;&gt;KVM&lt;/a&gt;, &lt;a href=&quot;https://www.virtualbox.org/&quot;&gt;Oracle VirtualBox&lt;/a&gt;, and even Minikube’s so-called multi-node mode—but let’s be real: I’ve got a beast of a MacBook Pro, so why not flex its muscles and spin up a legit multi-node cluster? 🚀&lt;/p&gt;
&lt;p&gt;But great news! &lt;strong&gt;On November 11, 2024, &lt;a href=&quot;https://blogs.vmware.com/cloud-foundation/2024/11/11/vmware-fusion-and-workstation-are-now-free-for-all-users/&quot;&gt;VMware announced that Fusion and Workstation are now free for all users&lt;/a&gt;!&lt;/strong&gt; The moment I stumbled upon this announcement, I was thrilled. Time to roll up my sleeves, fire up some VMs, and make this cluster a reality. Kick off my Kubernetes tutorial! Let’s dive in! 🚀&lt;/p&gt;
&lt;hr /&gt;
&lt;h1&gt;&lt;strong&gt;Project Overview&lt;/strong&gt;&lt;/h1&gt;
&lt;h2&gt;&lt;strong&gt;My Goal&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;In this Kubernetes tutorial series, I want to set up a full Kubernetes cluster on my MacBook Pro using &lt;strong&gt;VMware Fusion&lt;/strong&gt;, creating multiple VMs to simulate real-world deployment and practice my DevOps and IaC (Infrastructure as Code) skills.&lt;/p&gt;
&lt;h2&gt;&lt;strong&gt;Planned Setup&lt;/strong&gt;&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Create a VM as Base VM (Rocky Linux 9)&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Configure networking&lt;/li&gt;
&lt;li&gt;Update system packages&lt;/li&gt;
&lt;li&gt;Disable &lt;code&gt;firewalld&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Enable SSH passwordless login from local Mac to the base VM&lt;/li&gt;
&lt;li&gt;Set up &lt;code&gt;zsh&lt;/code&gt;, &lt;code&gt;tmux&lt;/code&gt;, &lt;code&gt;vim&lt;/code&gt; and common aliases&lt;/li&gt;
&lt;li&gt;Install &lt;strong&gt;Miniforge&lt;/strong&gt; for Python environment management&lt;/li&gt;
&lt;li&gt;Install and configure &lt;strong&gt;Ansible&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Set up a Local Server Node (&lt;code&gt;localserver&lt;/code&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Clone from the above base VM image&lt;/li&gt;
&lt;li&gt;Create an Ansible script to customize the base VM image with a new hostname, SSH keys, and networking&lt;/li&gt;
&lt;li&gt;Set up &lt;strong&gt;DNS and NTP servers&lt;/strong&gt; for internal hostname resolution and local time synchronization&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Create Kubernetes Nodes (&lt;code&gt;k8s-1&lt;/code&gt; to &lt;code&gt;k8s-4&lt;/code&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Clone from the base image&lt;/li&gt;
&lt;li&gt;Use the same Ansible script to customize each new VM&apos;s hostname, SSH keys, and networking&lt;/li&gt;
&lt;li&gt;Install core Kubernetes packages (&lt;code&gt;containerd&lt;/code&gt;, &lt;code&gt;kubelet&lt;/code&gt;, &lt;code&gt;kubeadm&lt;/code&gt;, &lt;code&gt;kubectl&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Enable &lt;code&gt;firewalld&lt;/code&gt; and open the necessary ports (Yes! Many online tutorials simply disable firewalld, but I want to raise the bar and get it working with iptables, like a production environment!)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Cluster Formation&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Set up &lt;code&gt;k8s-1&lt;/code&gt; as the Master Node with Flannel as the CNI plugin&lt;/li&gt;
&lt;li&gt;Set up &lt;code&gt;k8s-2, k8s-3, k8s-4&lt;/code&gt; as Worker Nodes and join them to the cluster&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Test/Deploy Nginx Service into Cluster via NodePort&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Set up the Kubernetes Cluster Dashboard&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;More is coming!&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;hr /&gt;
&lt;h2&gt;&lt;strong&gt;Networking&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;In the environment used in this Kubernetes tutorial, each VM will have &lt;strong&gt;two network interfaces&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ens160&lt;/code&gt; → Connected to &lt;code&gt;vmnet2&lt;/code&gt;, a private network (&lt;code&gt;172.16.211.0/24&lt;/code&gt;) created in VMFusion for Kubernetes, which I&apos;ll cover later&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ens224&lt;/code&gt; → Shared with Mac for Internet access.&lt;/li&gt;
&lt;/ul&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Hostname&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;IP Address (ens160)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;localserver&lt;/td&gt;
&lt;td&gt;DNS Server, NTPServer&lt;/td&gt;
&lt;td&gt;&lt;code&gt;172.16.211.100/24&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;k8s-1&lt;/td&gt;
&lt;td&gt;Master&lt;/td&gt;
&lt;td&gt;&lt;code&gt;172.16.211.11/24&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;k8s-2&lt;/td&gt;
&lt;td&gt;Worker&lt;/td&gt;
&lt;td&gt;&lt;code&gt;172.16.211.12/24&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;k8s-3&lt;/td&gt;
&lt;td&gt;Worker&lt;/td&gt;
&lt;td&gt;&lt;code&gt;172.16.211.13/24&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;k8s-4&lt;/td&gt;
&lt;td&gt;Worker&lt;/td&gt;
&lt;td&gt;&lt;code&gt;172.16.211.14/24&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
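&lt;p&gt;For example, the static addressing above can be applied on each clone with &lt;code&gt;nmcli&lt;/code&gt; (assuming the connection profile is named &lt;code&gt;ens160&lt;/code&gt;; change the address per node):&lt;/p&gt;

```shell
# Assign the private-network address for k8s-1 (adjust per node)
nmcli con mod ens160 ipv4.method manual ipv4.addresses 172.16.211.11/24
nmcli con up ens160
ip -4 addr show ens160   # verify the address is applied
```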
&lt;hr /&gt;
&lt;h1&gt;&lt;strong&gt;Creating the Rocky 9 Base VM&lt;/strong&gt;&lt;/h1&gt;
&lt;h2&gt;&lt;strong&gt;Configure a Custom Network in VMFusion&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;I hope you’ve already installed VMware Fusion—that part is straightforward.&lt;/p&gt;
&lt;p&gt;To create an isolated network among VMs for Kubernetes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Open &lt;strong&gt;VMware Fusion&lt;/strong&gt; → &lt;strong&gt;Preferences&lt;/strong&gt; → &lt;strong&gt;Network&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Add a new network (&lt;code&gt;vmnet2&lt;/code&gt;) &lt;img src=&quot;./image.png&quot; alt=&quot;Kubernetes tutorial part 1: network for nodes, created in VMFusion&quot; title=&quot;Kubernetes tutorial part 1: network for nodes, created in VMFusion&quot; /&gt;&lt;/li&gt;
&lt;li&gt;Uncheck &lt;strong&gt;&quot;Provide addresses on this network via DHCP&quot;&lt;/strong&gt; (as we’ll use static IPs)&lt;/li&gt;
&lt;/ol&gt;
&lt;hr /&gt;
&lt;h2&gt;Configure the Internet Network in VMFusion&lt;/h2&gt;
&lt;p&gt;This is straightforward; add it for each node as below:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./internet-access-network-818x1024.webp&quot; alt=&quot;Kubernetes tutorial part 1: internet access network&quot; title=&quot;Kubernetes tutorial part 1: internet access network&quot; /&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;&lt;strong&gt;Create the Base VM and Install Rocky Linux 9&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;I used to work with CentOS and loved it. Since CentOS Linux 8 was discontinued at the end of 2021 and Rocky Linux was announced as its replacement, I will set up Kubernetes on Rocky Linux 9.&lt;/p&gt;
&lt;p&gt;You can download the ISO from &lt;a href=&quot;https://rockylinux.org/download&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;:::note&lt;/p&gt;
&lt;p&gt;Please bear with me. It&apos;s a long article, but it&apos;s fun! Hope you will like my Kubernetes tutorial soon!&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;p&gt;For me, in this Kubernetes tutorial, my MacBook is Intel-based, so I used the x86_64 ISO and downloaded the DVD ISO, NOT the minimal ISO or boot ISO:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./rocky-linux-9-iso-download.webp&quot; alt=&quot;Kubernetes tutorial part 1: rocky linux 9 iso download page screenshot&quot; title=&quot;Kubernetes tutorial part 1: rocky linux 9 iso download page screenshot&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Create a new VM from VMFusion and select the ISO to start installation.&lt;/p&gt;
&lt;p&gt;During the Rocky 9 installation, manually set:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;hostname: baseimage&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;password of &lt;code&gt;root&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Create a user account &lt;code&gt;admin&lt;/code&gt; and make it as the user administrator&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;IP Address&lt;/strong&gt;: &lt;code&gt;172.16.211.3/24&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DNS Server&lt;/strong&gt;:&lt;code&gt;172.16.211.100&lt;/code&gt; , &lt;code&gt;8.8.8.8&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Search Domain&lt;/strong&gt;: &lt;code&gt;dev.geekcoding101local.com&lt;/code&gt; (We will configure this domain later in localserver VM)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;NTP Server&lt;/strong&gt;: pointing to the NTP server running on localserver (&lt;code&gt;172.16.211.100&lt;/code&gt;), which we will set up later&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A few screenshots:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-1.png&quot; alt=&quot;Kubernetes tutorial part 1: create admin user during ISO installation&quot; title=&quot;Kubernetes tutorial part 1: create admin user during ISO installation&quot; /&gt;&lt;/p&gt;
&lt;p&gt;(I added the &lt;code&gt;ens224&lt;/code&gt; network adapter after the ISO installation, which is why it isn&apos;t shown below)&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-2.png&quot; alt=&quot;Kubernetes tutorial part 1: Configure network during ISO installation&quot; title=&quot;Kubernetes tutorial part 1: Configure network during ISO installation&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-3.png&quot; alt=&quot;Kubernetes tutorial part 1: pointing to NTP server during iso installation&quot; title=&quot;Kubernetes tutorial part 1: pointing to NTP server during iso installation&quot; /&gt;&lt;/p&gt;
&lt;p&gt;If you forgot to configure DNS during installation, update it via command line post installation:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;nmcli con mod ens160 ipv4.dns &quot;172.16.211.100 8.8.8.8 8.8.4.4&quot;
nmcli con mod ens160 ipv4.dns-search &quot;dev.geekcoding101local.com&quot;
nmcli con mod ens160 ipv4.ignore-auto-dns yes
nmcli con up ens160
nmcli dev show ens160

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once the &lt;strong&gt;DNS server (&lt;code&gt;172.16.211.100&lt;/code&gt;) on &lt;code&gt;localserver&lt;/code&gt;&lt;/strong&gt; is up, you should be able to resolve hostnames:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;nslookup baseimage
nslookup baseimage.dev.geekcoding101local.com
hostname -f
hostname -s
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;&lt;strong&gt;Tips: Network Interface Names&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;The network adapter name &lt;code&gt;ens160&lt;/code&gt; in Rocky 9 is assigned based on &lt;strong&gt;Predictable Network Interface Names&lt;/strong&gt; (PNIN), a naming convention introduced in &lt;strong&gt;systemd v197&lt;/strong&gt; to ensure stable and predictable interface names across reboots and hardware changes. The name &lt;code&gt;ens160&lt;/code&gt; specifically follows the &lt;strong&gt;&quot;Firmware/BIOS Index-based Naming&quot;&lt;/strong&gt; scheme, where:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;en&lt;/code&gt; stands for &lt;strong&gt;Ethernet&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;s160&lt;/code&gt; refers to the &lt;strong&gt;firmware (BIOS/UEFI) assigned slot index&lt;/strong&gt; (160), which is based on how the hypervisor or hardware presents the device.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Why is it &lt;code&gt;ens160&lt;/code&gt;?&lt;/h3&gt;
&lt;p&gt;On &lt;strong&gt;VMware&lt;/strong&gt;, the &lt;code&gt;ens160&lt;/code&gt; interface name is commonly assigned because VMware presents the &lt;strong&gt;first&lt;/strong&gt; virtual NIC with firmware index &lt;code&gt;160&lt;/code&gt;. This is specific to VMware’s implementation.&lt;/p&gt;
&lt;h3&gt;Is it Consistent Across All Rocky Linux 9 Installs?&lt;/h3&gt;
&lt;p&gt;Not necessarily. The naming depends on the hardware and hypervisor:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;VMware&lt;/strong&gt;: The first NIC is typically named &lt;code&gt;ens160&lt;/code&gt; because of VMware’s firmware enumeration.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Physical Machines&lt;/strong&gt;: The first NIC may be named &lt;code&gt;ens3&lt;/code&gt;, &lt;code&gt;ens5f0&lt;/code&gt;, &lt;code&gt;enp1s0&lt;/code&gt;, &lt;code&gt;eno1&lt;/code&gt;, etc., depending on:
&lt;ul&gt;
&lt;li&gt;PCI bus topology (&lt;code&gt;enpXYSZ&lt;/code&gt; for PCI enumeration).&lt;/li&gt;
&lt;li&gt;Onboard NICs (&lt;code&gt;enoX&lt;/code&gt; for motherboard NICs).&lt;/li&gt;
&lt;li&gt;BIOS/firmware-assigned index (&lt;code&gt;ensX&lt;/code&gt; for BIOS indexing).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Other Hypervisors&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;KVM/QEMU&lt;/strong&gt;: Uses &lt;code&gt;ens3&lt;/code&gt; or &lt;code&gt;enp1s0&lt;/code&gt; (based on PCI bus mapping).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hyper-V&lt;/strong&gt;: Uses &lt;code&gt;eth0&lt;/code&gt; or &lt;code&gt;ensX&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
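&lt;p&gt;To make the convention concrete, here is a tiny sketch (a hypothetical helper, not an official tool) that decodes the common predictable-name prefixes described above:&lt;/p&gt;

```shell
# Hypothetical helper: map a predictable interface name to its naming scheme.
decode_ifname() {
  case "$1" in
    eno[0-9]*)        echo "Ethernet, onboard (firmware index ${1#eno})" ;;
    ens[0-9]*)        echo "Ethernet, hotplug/firmware slot ${1#ens}" ;;
    enp[0-9]*s[0-9]*) echo "Ethernet, PCI bus/slot location (${1#en})" ;;
    eth[0-9]*)        echo "Ethernet, classic kernel naming" ;;
    *)                echo "unknown scheme" ;;
  esac
}

decode_ifname ens160   # prints: Ethernet, hotplug/firmware slot 160
decode_ifname enp1s0   # PCI path-based name, common on KVM
```

&lt;p&gt;Pattern order matters here: the more specific onboard and slot patterns are checked before the PCI-path one.&lt;/p&gt;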
&lt;h3&gt;Can You Change It?&lt;/h3&gt;
&lt;p&gt;Yes, if you want to ensure consistent naming across environments, you can override it using:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;udev rules&lt;/strong&gt; (&lt;code&gt;/etc/udev/rules.d/70-persistent-net.rules&lt;/code&gt;)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GRUB kernel parameters&lt;/strong&gt; (disable PNIN):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;grubby --update-kernel=ALL --args=&quot;net.ifnames=0 biosdevname=0&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will revert to &lt;code&gt;eth0&lt;/code&gt;, &lt;code&gt;eth1&lt;/code&gt;, etc.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is out of the scope of the current blog post; feel free to add a comment if you want to see a post about how to override it.&lt;/p&gt;
&lt;p&gt;[/infobox]&lt;/p&gt;
&lt;p&gt;:::info&lt;/p&gt;
&lt;p&gt;Do you like the above style of &lt;code&gt;Tips&lt;/code&gt;? Hope so! I will test out this format in this Kubernetes tutorial; let me know!&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;p&gt;Once the system is up, let&apos;s disable firewalld. Obviously I don&apos;t want to get stuck on firewall issues while preparing the base VM image (we will turn it back on when setting up the Kubernetes cluster):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;systemctl stop firewalld
systemctl disable firewalld
systemctl mask firewalld
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;disable&lt;/code&gt;&lt;/strong&gt;: Disables the service from starting automatically at boot but doesn&apos;t prevent manual starts.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;mask&lt;/code&gt;&lt;/strong&gt;: Prevents the service from being started manually or automatically by creating a link to &lt;code&gt;/dev/null&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
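&lt;p&gt;Under the hood, masking works by symlinking the unit name to &lt;code&gt;/dev/null&lt;/code&gt; in &lt;code&gt;/etc/systemd/system&lt;/code&gt;, so systemd cannot load the unit at all. A minimal sketch of that mechanism, demonstrated in a throwaway directory instead of the real systemd tree:&lt;/p&gt;

```shell
# Reproduce what `systemctl mask firewalld` creates, in a temp dir:
tmpdir=$(mktemp -d)
ln -s /dev/null "$tmpdir/firewalld.service"

target=$(readlink "$tmpdir/firewalld.service")
echo "$target"   # prints: /dev/null
```

&lt;p&gt;That is also why &lt;code&gt;systemctl unmask&lt;/code&gt; is needed later: it simply removes this symlink.&lt;/p&gt;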
&lt;p&gt;Thanks for reading! I hope you are enjoying my Kubernetes tutorial so far!&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Update Packages&lt;/h2&gt;
&lt;p&gt;Here I installed my favorite packages/tools, including vim, tmux, zsh, etc.&lt;/p&gt;
&lt;p&gt;You can add your own essential tools to the list below so that they are available on every new VM cloned from this base image:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;dnf update -y
dnf install vim wget git tmux perl-Time-HiRes bind-utils util-linux-user zsh -y
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;perl-Time-HiRes&lt;/code&gt;: required by tmux to show the time.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;bind-utils&lt;/code&gt;: provides nslookup and other DNS-related tools.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;util-linux-user&lt;/code&gt;: provides chsh.&lt;/li&gt;
&lt;/ul&gt;
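&lt;p&gt;If you want a quick sanity check that the packages actually provide their commands, a small helper like this works (a sketch of my own, not a dnf feature; it prints any command name not found on &lt;code&gt;PATH&lt;/code&gt;):&lt;/p&gt;

```shell
# Print any of the given command names that are missing from PATH.
check_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# On the freshly provisioned base image this should print nothing:
check_tools vim wget git tmux zsh nslookup chsh
```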
&lt;hr /&gt;
&lt;h2&gt;Setup password-less SSH authentication from local to VM&lt;/h2&gt;
&lt;p&gt;I love this! It&apos;s a must for any development environment settings!&lt;/p&gt;
&lt;p&gt;It&apos;s so annoying if you need to type password at every login!&lt;/p&gt;
&lt;p&gt;[infobox title=&quot;Tips&quot;]&lt;/p&gt;
&lt;p&gt;The Rocky Linux 9 DVD installs the SSHD server by default.&lt;/p&gt;
&lt;p&gt;[/infobox]&lt;/p&gt;
&lt;p&gt;Typically, we should use &lt;code&gt;ssh-agent&lt;/code&gt; for better key management and security, but since this is a base image and we just want password-less access from our local Mac to the new VMs, it&apos;s simpler to prepare the &lt;code&gt;authorized_keys&lt;/code&gt; file. This way, we can quickly enable password-less authentication without dealing with additional setup or dependencies! That&apos;s what I will use in this kubernetes tutorial!&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Perform these steps on your local machine (mine is a MacBook Pro):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ssh-keygen -t rsa
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Just follow the prompts with the default settings and you will get your key pair at &lt;code&gt;~/.ssh/id_rsa.pub&lt;/code&gt; and &lt;code&gt;~/.ssh/id_rsa&lt;/code&gt;. Save the content of &lt;code&gt;~/.ssh/id_rsa.pub&lt;/code&gt; by printing it:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cat ~/.ssh/id_rsa.pub
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Log into baseimage as root to perform:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;mkdir -p /root/.ssh/
chmod 700 /root/.ssh/
touch /root/.ssh/authorized_keys
chmod 600 /root/.ssh/authorized_keys
vi /root/.ssh/authorized_keys
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the vi editor above, paste the content of &lt;code&gt;~/.ssh/id_rsa.pub&lt;/code&gt; and save it. Repeat the steps above to create the &lt;code&gt;.ssh&lt;/code&gt; folder and populate &lt;code&gt;/home/admin/.ssh/authorized_keys&lt;/code&gt; for the &lt;code&gt;admin&lt;/code&gt; account. Then restart sshd on the base VM:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;systemctl restart sshd
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Test login to the base VM from your local machine; you should not need to type a password:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ssh -vv root@172.16.211.3
ssh -vv admin@172.16.211.3
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Using &lt;code&gt;-vv&lt;/code&gt; here is useful, because your first attempt at setting up SSH password-less authentication will most likely fail due to some misconfiguration. With &lt;code&gt;-vv&lt;/code&gt; you can spot the error message. Good luck!&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
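&lt;p&gt;One of the most common reasons key authentication silently fails is wrong permissions: &lt;code&gt;~/.ssh&lt;/code&gt; should be 700 and &lt;code&gt;authorized_keys&lt;/code&gt; 600, as set above. Here is a small helper of my own (assuming GNU coreutils &lt;code&gt;stat -c&lt;/code&gt;, as shipped on Rocky Linux) to double-check the modes before reaching for &lt;code&gt;-vv&lt;/code&gt;:&lt;/p&gt;

```shell
# Compare a file's octal mode against the expected one.
check_mode() {
  actual=$(stat -c '%a' "$1" 2>/dev/null) || { echo "$1: missing"; return 1; }
  if [ "$actual" = "$2" ]; then
    echo "$1: OK ($actual)"
  else
    echo "$1: expected $2, got $actual"
  fi
}

# On the base VM you would run:
#   check_mode /root/.ssh 700
#   check_mode /root/.ssh/authorized_keys 600
```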
&lt;hr /&gt;
&lt;h2&gt;&lt;strong&gt;Set Up Essential Tools&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;Create a shared tools directory:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;mkdir -p /opt/share_tools/bin/
chmod -R 755 /opt/share_tools/
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Verify directory permissions:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ls -l /opt | grep share
ls -l /opt/share_tools
&lt;/code&gt;&lt;/pre&gt;
&lt;hr /&gt;
&lt;h3&gt;Setup Zsh as the Shared Default Shell&lt;/h3&gt;
&lt;p&gt;Zsh is the first basic thing I want to cover in this Kubernetes tutorial!&lt;/p&gt;
&lt;p&gt;Zsh is the default shell on Mac. I want to have it on the Rocky Linux VM as well.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Install Zsh and Packages (Ensure Zsh is Installed)&lt;/strong&gt;&lt;br /&gt;
Zsh should already be installed from the &quot;&lt;a href=&quot;/?p=4857&amp;amp;preview=true#Update_Packages&quot;&gt;Update OS and Install Packages&lt;/a&gt;&quot; section. This guide is best viewed on &lt;a href=&quot;/&quot;&gt;GeekCoding101&lt;/a&gt;, where it was originally published.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Install Oh-My-Zsh&lt;/strong&gt;&lt;br /&gt;
Run the following command to install Oh-My-Zsh:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;wget https://github.com/robbyrussell/oh-my-zsh/raw/master/tools/install.sh -O - | zsh
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;By default, Oh-My-Zsh is installed in your user’s home directory (&lt;code&gt;~/.oh-my-zsh&lt;/code&gt;).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Copy the Checked Out Folder to a Shared Path&lt;/strong&gt;&lt;br /&gt;
Copy the Oh-My-Zsh directory to a shared path:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cp -r ~/.oh-my-zsh /usr/share/oh-my-zsh
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Install Powerlevel10k to a Shared Path&lt;/strong&gt;&lt;br /&gt;
Clone the Powerlevel10k theme into a shared location:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git clone --depth=1 https://github.com/romkatv/powerlevel10k.git /usr/share/powerlevel10k
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You need a patched font to display the icons in the shell console. A recommended font can be downloaded here: &lt;a href=&quot;https://github.com/ryanoasis/nerd-fonts/tree/master/patched-fonts/Meslo/M/Regular&quot;&gt;Meslo Nerd Font patched for Powerlevel10k&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I am using iTerm2, so I configured the font for the profile as below: &lt;img src=&quot;./iterm2-font-settings.webp&quot; alt=&quot;Kubernetes tutorial part 1: iterm2 font settings&quot; title=&quot;Kubernetes tutorial part 1: iterm2 font settings&quot; /&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Update &lt;code&gt;~/.zshrc&lt;/code&gt;&lt;/strong&gt;&lt;br /&gt;
Modify your &lt;code&gt;.zshrc&lt;/code&gt; to use the shared paths (Here I used &lt;code&gt;~/.zshrc&lt;/code&gt;, but we don&apos;t need this for every user, because later we will dump the content of &lt;code&gt;~/.zshrc&lt;/code&gt; to &lt;code&gt;/etc/zshrc&lt;/code&gt; for all users):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;export ZSH=&quot;/usr/share/oh-my-zsh&quot; 
ZSH_THEME=&quot;powerlevel10k/powerlevel10k&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Copy Configured Files to &lt;code&gt;/etc/skel&lt;/code&gt; for New Users&lt;/strong&gt;&lt;br /&gt;
Once the configuration is complete, copy the necessary files to the &lt;code&gt;/etc/skel&lt;/code&gt; directory for new users:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cp ~/.zshrc /etc/skel/ 
cp ~/.p10k.zsh /etc/skel/ 
chmod 644 /etc/skel/.zshrc /etc/skel/.p10k.zsh
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Set Default Shell for New Users&lt;/strong&gt;&lt;br /&gt;
Update &lt;code&gt;/etc/default/useradd&lt;/code&gt; to set Zsh as the default shell for new users:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;SHELL=/bin/zsh
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h4&gt;Modify &lt;code&gt;/etc/zshrc&lt;/code&gt; for Shared Configuration&lt;/h4&gt;
&lt;p&gt;At the end of &lt;code&gt;/etc/zshrc&lt;/code&gt;, add the following to handle SSH sessions:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Check if this is an SSH session 
# If not, launch bash because console fonts couldn&apos;t support oh-my-zsh 
if [[ -z &quot;$SSH_CONNECTION&quot; ]]; then 
  exec /bin/bash
fi
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./launch-bash-if-not-ssh.webp&quot; alt=&quot;Kubernetes tutorial part 1: launch bash if not ssh&quot; title=&quot;Kubernetes tutorial part 1: launch bash if not ssh&quot; /&gt;&lt;/p&gt;
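&lt;p&gt;The check above hinges on &lt;code&gt;$SSH_CONNECTION&lt;/code&gt;, which sshd sets to &lt;code&gt;&quot;client_ip client_port server_ip server_port&quot;&lt;/code&gt; for SSH sessions and which is absent on the local console. A tiny sketch of the same test, runnable anywhere:&lt;/p&gt;

```shell
# Classify the current session the same way the /etc/zshrc snippet does.
session_type() {
  if [ -n "$SSH_CONNECTION" ]; then echo ssh; else echo console; fi
}

(SSH_CONNECTION="172.16.211.2 55100 172.16.211.3 22"; session_type)  # prints: ssh
(unset SSH_CONNECTION; session_type)                                 # prints: console
```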
&lt;p&gt;You might notice the screenshot above has a very nice status bar in Vim; let me know in the comments if you want to know how I customized my Vim ^^ Append the &lt;code&gt;.zshrc&lt;/code&gt; configuration into &lt;code&gt;/etc/zshrc&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cat .zshrc &amp;gt;&amp;gt; /etc/zshrc
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Set Up Global Aliases in &lt;code&gt;/etc/zshenv&lt;/code&gt;&lt;/h4&gt;
&lt;p&gt;Populate &lt;code&gt;/etc/zshenv&lt;/code&gt; with my favorite aliases:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# GNU coreutils (Rocky Linux) uses --color=auto; -G is the macOS/BSD flag
alias ls=&apos;ls --color=auto&apos;
alias ll=&apos;ls --color=auto -l&apos;
alias la=&apos;ls --color=auto -la&apos;
# Git
alias gs=&apos;git status &apos;
alias ga=&apos;git add &apos;
alias gb=&apos;git branch &apos;
alias gba=&apos;git branch -a&apos;
alias gbd=&apos;git branch -d&apos;
alias gbr=&apos;git branch -r&apos;
alias gc=&apos;git commit &apos;
alias gd=&apos;git diff &apos;
alias gdh=&apos;git diff HEAD &apos;
alias gco=&apos;git checkout &apos;
alias glg=&apos;git log  --graph   --name-only &apos;
# Get resources
alias k=kubectl
alias kg=&apos;kubectl get&apos;
alias kga=&apos;kubectl get all --all-namespaces&apos;
alias kgns=&quot;kubectl get ns --show-labels&quot;
alias kgp=&quot;kubectl get pods -o wide&quot;
alias kgpn=&quot;kubectl get pods -o wide -n &quot;
alias kgpa=&quot;kubectl get pods -A -o wide&quot;
alias kgpjson=&apos;kubectl get pods -o=json&apos;                 # options: -n &amp;lt;ns&amp;gt; &amp;lt;pn&amp;gt;
alias kgpsys=&apos;kubectl --namespace=kube-system get pods&apos;
alias kgs=&quot;kubectl get service -o wide&quot;
alias kgsn=&quot;kubectl get service -o wide -n&quot;
alias kgn=&quot;kubectl get nodes -o wide&quot;

# Describe
alias kdns=&apos;kubectl describe namespace&apos;
alias kdn=&apos;kubectl describe node&apos;
alias kdpn=&quot;kubectl describe pod -n&quot;            # options: -n &amp;lt;ns&amp;gt; &amp;lt;pn&amp;gt;

# Delete
alias krm=&apos;kubectl delete&apos;
alias krmf=&apos;kubectl delete -f&apos;
alias krming=&apos;kubectl delete ingress&apos;
alias krmingl=&apos;kubectl delete ingress -l&apos;
alias krmingall=&apos;kubectl delete ingress --all-namespaces&apos;
# Misc
alias ka=&apos;kubectl apply -f&apos;
alias klo=&apos;kubectl logs -f&apos;
alias kex=&apos;kubectl exec -i -t&apos;

export GPG_TTY=$(tty)
export SHARE_TOOLS=&quot;/opt/share_tools/bin/&quot;
export PATH=${SHARE_TOOLS}:$PATH

&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Update &lt;code&gt;zsh&lt;/code&gt; For Existing Users&lt;/h4&gt;
&lt;p&gt;I know this is our first VM, but just in case you want to configure an existing VM: for users created before setting up &lt;code&gt;Oh-My-Zsh&lt;/code&gt; and &lt;code&gt;Powerlevel10k&lt;/code&gt;, update their shell to Zsh (replace ${targetuser} with the real username):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;chsh -s /bin/zsh ${targetuser}
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Maintenance&lt;/h4&gt;
&lt;p&gt;For future maintenance, we only need to update the following files:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;/etc/zshenv&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;/etc/zshrc&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;code&gt;Oh-My-Zsh&lt;/code&gt; will check for updates and prompt you at every login, so no need to worry there!&lt;/p&gt;
&lt;p&gt;:::info&lt;/p&gt;
&lt;p&gt;Thanks for reading! So far so good? I hope you are enjoying my Kubernetes tutorial! If you have any feedback, feel free to leave a comment!&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Configure Tmux for Multi-Session Management&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Ever had SSH sessions drop in the middle of a deployment? Or needed to juggle multiple terminals like a hacker in a sci-fi movie? &lt;code&gt;tmux&lt;/code&gt; solves it all. With persistent sessions, split panes, and the ability to detach and reattach at will, I can effortlessly manage multiple Kubernetes nodes, tail logs, and run long processes without worrying about losing my progress. It’s basically my &lt;strong&gt;command-line command center&lt;/strong&gt;, a friend of Kubernetes cluster administrator, and once you get hooked, there’s no going back. 🚀 This is another must for any development environment! Let me show you the tricks in this kubernetes tutorial!&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;Tmux Launcher Script&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;I want to launch Tmux automatically when I SSH into the VM, so I need a script to launch it and hook it into the Zsh startup.&lt;/p&gt;
&lt;p&gt;Create a script for launching tmux:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;vim /opt/share_tools/bin/launch_tmux.sh
chmod +x /opt/share_tools/bin/launch_tmux.sh
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Script contents:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#!/bin/zsh

SESSION_NAME=&quot;k8s&quot;

if [ -z &quot;$TMUX&quot; ]; then
  tmux has-session -t ${SESSION_NAME} 2&amp;gt;/dev/null

  if [[ $? != 0 ]]; then
    tmux new-session -s ${SESSION_NAME}
  else
    tmux ls | grep -q &quot;${SESSION_NAME}:.*(attached)&quot;
    if [[ $? == 0 ]]; then
      tmux new-session
    else
      tmux attach -t ${SESSION_NAME}
    fi
  fi
else
  echo &quot;Already inside a tmux session; not nesting.&quot;
fi

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Append below into &lt;code&gt;/etc/zshrc&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Launch tmux
# Check if the user is connected via SSH
if [[ -n &quot;$SSH_CONNECTION&quot; ]]; then
  # Launch the tmux script
  /opt/share_tools/bin/launch_tmux.sh
fi
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./launch-tmux-if-ssh.webp&quot; alt=&quot;Kubernetes tutorial part 1: launch tmux if it is ssh&quot; title=&quot;Kubernetes tutorial part 1: launch tmux if it is ssh&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Now opening a new SSH session will show (default tmux + oh-my-zsh + powerlevel10k):&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./tmux-initial-show.png&quot; alt=&quot;Kubernetes tutorial part 1: tmux initial show screenshot&quot; title=&quot;Kubernetes tutorial part 1: tmux initial show screenshot&quot; /&gt;&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;Install and Configure &lt;a href=&quot;https://github.com/gpakosz/.tmux.git&quot;&gt;gpakosz/.tmux.git&lt;/a&gt; for Tmux&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;The UI of the tmux above was too plain?! Not cool!&lt;/p&gt;
&lt;p&gt;Okay, let&apos;s customize it a little bit with &lt;code&gt;gpakosz/.tmux.git&lt;/code&gt;!&lt;/p&gt;
&lt;p&gt;Log into the base VM as root to perform:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git clone https://github.com/gpakosz/.tmux.git /opt/gpakosz.tmux/
ln -s /opt/gpakosz.tmux/.tmux.conf /etc/tmux.conf
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Set below in &lt;code&gt;/etc/zshenv&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# To make tmux work for all users on the VM, TMUX_CONF must be set to /etc/tmux.conf.
# It won&apos;t work if TMUX_CONF is set to another value, like &quot;/opt/gpakosz.tmux/.tmux.conf&quot;
export TMUX_CONF=&quot;/etc/tmux.conf&quot;
export TMUX_CONF_LOCAL=&quot;/opt/gpakosz.tmux/.tmux.conf.local&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Ensure &lt;strong&gt;&lt;code&gt;perl-Time-HiRes&lt;/code&gt;&lt;/strong&gt; is installed at the &lt;a href=&quot;/?p=4857&amp;amp;preview=true#Update_Packages&quot;&gt;Update_Packages step&lt;/a&gt;.&lt;/p&gt;
&lt;h4&gt;Customize the TMUX theme&lt;/h4&gt;
&lt;p&gt;Append below into &lt;code&gt;/opt/gpakosz.tmux/.tmux.conf.local&lt;/code&gt; before the line &lt;code&gt;# -- custom variables&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# increase history size
set -g history-limit 9999999
# start with mouse mode enabled
set -g mouse on

bind-key -n C-S-Left swap-window -t -1\; select-window -t -1
bind-key -n C-S-Right swap-window -t +1\; select-window -t +1

# -- custom variables ----------------------------------------------------------
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Still in &lt;code&gt;/opt/gpakosz.tmux/.tmux.conf.local&lt;/code&gt;, find &lt;code&gt;mode-keys vi&lt;/code&gt; and uncomment it:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./enable-vi-in-tmux.webp&quot; alt=&quot;enable vi in tmux&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Continue customization in &lt;code&gt;/opt/gpakosz.tmux/.tmux.conf.local&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I&apos;d like to change some colors; just follow me, find the settings below, and update them as shown:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;tmux_conf_theme_left_separator_main=&apos;\uE0B0&apos;  # /!\ you don&apos;t need to install Powerline
tmux_conf_theme_left_separator_sub=&apos;\uE0B1&apos;   # you only need fonts patched with
tmux_conf_theme_right_separator_main=&apos;\uE0B2&apos; # Powerline symbols or the standalone
tmux_conf_theme_right_separator_sub=&apos;\uE0B3&apos;  # PowerlineSymbols.otf font, see README.md

tmux_conf_theme_status_left=&quot; ☮️ #S | &quot;

# status right style
tmux_conf_theme_status_right_fg=&quot;$tmux_conf_theme_colour_12,$tmux_conf_theme_colour_14,$tmux_conf_theme_colour_6&quot;
tmux_conf_theme_status_right_bg=&quot;$tmux_conf_theme_colour_15,$tmux_conf_theme_colour_17,$tmux_conf_theme_colour_9&quot;


&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now take a look (I used the localserver VM we will create later to take this screenshot; the &quot;localserver&quot; label at the top right is set by &lt;a href=&quot;https://iterm2.com/&quot;&gt;iTerm2&lt;/a&gt;)!&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./tmux-with-oh-my-zsh.webp&quot; alt=&quot;Kubernetes tutorial part 1: tmux with oh my zsh&quot; title=&quot;Kubernetes tutorial part 1: tmux with oh my zsh&quot; /&gt;&lt;/p&gt;
&lt;p&gt;:::info&lt;/p&gt;
&lt;p&gt;Are you bored so far? I hope not! If you have any feedback about this Kubernetes tutorial, I am looking forward to seeing your comments!&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;Install Miniforge&lt;/h3&gt;
&lt;p&gt;I haven&apos;t thought about the exact use case for Python in this Kubernetes environment, but I want a Python management toolkit ready on the base image so that it comes in handy in the future. Let&apos;s cover this in this Kubernetes tutorial as well!&lt;/p&gt;
&lt;p&gt;In my development environment, e.g. this Kubernetes cluster environment, I prefer &lt;strong&gt;Miniforge&lt;/strong&gt; over &lt;strong&gt;Anaconda&lt;/strong&gt; to manage Python. Why deal with the bloated, corporate-flavored Anaconda distribution when you can have a &lt;strong&gt;lightweight, community-driven alternative&lt;/strong&gt; that just works? 🚀 Miniforge gives you the &lt;strong&gt;same Conda package management power&lt;/strong&gt;, but &lt;strong&gt;without the unnecessary packages&lt;/strong&gt;, keeping it &lt;strong&gt;fast and minimal&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The installation is simple.&lt;/p&gt;
&lt;p&gt;Run &lt;code&gt;curl&lt;/code&gt; command to download from &lt;a href=&quot;https://conda-forge.org/download/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Then install it to &lt;code&gt;/opt/miniforge3&lt;/code&gt; so that every user can use it:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;curl -L -O &quot;https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh&quot;
chmod +x Miniforge3-Linux-x86_64.sh

❯ ./Miniforge3-Linux-x86_64.sh -h

usage: ./Miniforge3-Linux-x86_64.sh [options]

Installs Miniforge3 24.11.3-0
-b           run install in batch mode (without manual intervention),
             it is expected the license terms (if any) are agreed upon
-f           no error if install prefix already exists
-h           print this help message and exit
-p PREFIX    install prefix, defaults to /root/miniforge3, must not contain spaces.
-s           skip running pre/post-link/install scripts
-u           update an existing installation
-t           run package tests after installation (may install conda-build)

❯ ./Miniforge3-Linux-x86_64.sh -p /opt/miniforge3
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;During the installation, it also asked whether to update my shell configuration; I answered &lt;code&gt;yes&lt;/code&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./miniforge-installation.webp&quot; alt=&quot;miniforge installation&quot; /&gt;&lt;/p&gt;
&lt;p&gt;After installation, I noticed that &lt;code&gt;/etc/zshrc&lt;/code&gt; got updated as below:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# &amp;gt;&amp;gt;&amp;gt; conda initialize &amp;gt;&amp;gt;&amp;gt;
# !! Contents within this block are managed by &apos;conda init&apos; !!
__conda_setup=&quot;$(&apos;/opt/miniforge3/bin/conda&apos; &apos;shell.zsh&apos; &apos;hook&apos; 2&amp;gt; /dev/null)&quot;
if [ $? -eq 0 ]; then
    eval &quot;$__conda_setup&quot;
else
    if [ -f &quot;/opt/miniforge3/etc/profile.d/conda.sh&quot; ]; then
        . &quot;/opt/miniforge3/etc/profile.d/conda.sh&quot;
    else
        export PATH=&quot;/opt/miniforge3/bin:$PATH&quot;
    fi
fi
unset __conda_setup

if [ -f &quot;/opt/miniforge3/etc/profile.d/mamba.sh&quot; ]; then
    . &quot;/opt/miniforge3/etc/profile.d/mamba.sh&quot;
fi
# &amp;lt;&amp;lt;&amp;lt; conda initialize &amp;lt;&amp;lt;&amp;lt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can see it sets &lt;code&gt;PATH&lt;/code&gt; above, but just to be safe, so that programs under &lt;code&gt;/opt/miniforge3/bin&lt;/code&gt; are always found, I also manually updated my &lt;code&gt;/etc/zshenv&lt;/code&gt; as below:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;export MINIFORGE=&quot;/opt/miniforge3/bin&quot;
export PATH=${MINIFORGE}:${SHARE_TOOLS}:$PATH
&lt;/code&gt;&lt;/pre&gt;
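&lt;p&gt;Why prepend rather than append? The first &lt;code&gt;PATH&lt;/code&gt; entry that contains a matching name wins, so tools in &lt;code&gt;/opt/miniforge3/bin&lt;/code&gt; shadow any same-named system binaries. A minimal sketch with a throwaway directory and a made-up &lt;code&gt;mytool&lt;/code&gt; binary:&lt;/p&gt;

```shell
# Create a fake tool in a temp dir and prepend that dir to PATH.
demo=$(mktemp -d)
printf '#!/bin/sh\necho from-demo-dir\n' > "$demo/mytool"
chmod +x "$demo/mytool"

PATH="$demo:$PATH"
resolved=$(command -v mytool)
echo "$resolved"    # the copy inside $demo wins
mytool              # prints: from-demo-dir
```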
&lt;p&gt;Let&apos;s run a test:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ conda env list

# conda environments:
#
base                 * /opt/miniforge3

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;:::info&lt;/p&gt;
&lt;p&gt;Do you like my Kubernetes tutorial so far? Rate it 5 stars!&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;&lt;strong&gt;Install and Configure Ansible&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Okay, now it&apos;s time to install Ansible in this Kubernetes tutorial. Let&apos;s use Ansible to manage the operations in Kubernetes nodes.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;dnf install epel-release -y
dnf install ansible -y
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Generate a default configuration file:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ansible-config init --disabled &amp;gt; /etc/ansible/ansible.cfg
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Update &lt;code&gt;/etc/zshenv&lt;/code&gt; to append below line:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;export ANSIBLE_CONFIG=/etc/ansible/ansible.cfg
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Source &lt;code&gt;/etc/zshenv&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;source /etc/zshenv
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Update &lt;code&gt;/etc/ansible/ansible.cfg&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[defaults]
inventory = /etc/ansible/hosts
log_path = /var/log/ansible.log
host_key_checking = False
retry_files_enabled = False
timeout = 10
display_skipped_hosts = False
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Verify installation:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-4.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;[warningbox title=&quot;Tips:&quot;]&lt;/p&gt;
&lt;p&gt;If you do not add the export line to &lt;code&gt;/etc/zshenv&lt;/code&gt; and source it, &lt;code&gt;ansible --version&lt;/code&gt; will use &lt;code&gt;/root/ansible.cfg&lt;/code&gt; instead, like this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-6.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;[/warningbox]&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;Configure Ansible Hosts&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Edit &lt;code&gt;/etc/ansible/hosts&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[base]
baseimage ansible_host=172.16.211.3 ansible_user=root ansible_ssh_private_key_file=~/.ssh/ansible_ed25519

[application_servers]
localserver ansible_host=172.16.211.100 ansible_user=root ansible_ssh_private_key_file=~/.ssh/ansible_ed25519

[k8s_cluster]
k8s-1 ansible_host=172.16.211.11 ansible_user=root ansible_ssh_private_key_file=~/.ssh/ansible_ed25519
k8s-2 ansible_host=172.16.211.12 ansible_user=root ansible_ssh_private_key_file=~/.ssh/ansible_ed25519
k8s-3 ansible_host=172.16.211.13 ansible_user=root ansible_ssh_private_key_file=~/.ssh/ansible_ed25519
k8s-4 ansible_host=172.16.211.14 ansible_user=root ansible_ssh_private_key_file=~/.ssh/ansible_ed25519

&lt;/code&gt;&lt;/pre&gt;
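&lt;p&gt;The four k8s node entries above differ only by index and IP suffix, so if you dislike typing repetitive lines, they can be generated. A small sketch (same IPs and key path as the inventory above):&lt;/p&gt;

```shell
# Emit the [k8s_cluster] host lines for nodes 1..4.
gen_inventory() {
  key="$1"
  for i in 1 2 3 4; do
    printf 'k8s-%s ansible_host=172.16.211.1%s ansible_user=root ansible_ssh_private_key_file=%s\n' \
      "$i" "$i" "$key"
  done
}

gen_inventory "~/.ssh/ansible_ed25519"
```

&lt;p&gt;Redirect the output with &lt;code&gt;&amp;gt;&amp;gt; /etc/ansible/hosts&lt;/code&gt; after the &lt;code&gt;[k8s_cluster]&lt;/code&gt; header if you take this route.&lt;/p&gt;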
&lt;h4&gt;&lt;strong&gt;SSH Key Setup for Ansible&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Let&apos;s generate a new key pair just for Ansible, which also makes maintenance and isolation easier:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ssh-keygen -t ed25519 -C &quot;ansible-key&quot; -f ~/.ssh/ansible_ed25519
ssh-copy-id -i ~/.ssh/ansible_ed25519.pub root@172.16.211.3
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;Test SSH access&lt;/h4&gt;
&lt;pre&gt;&lt;code&gt;ssh -i ~/.ssh/ansible_ed25519 root@172.16.211.3
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run a quick Ansible test:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ansible baseimage -m ping
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Example output:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-7.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;[warningbox title=&quot;Tips&quot;]&lt;/p&gt;
&lt;h4&gt;Tips: Why do we need ansible_ssh_private_key_file in &lt;code&gt;/etc/ansible/hosts&lt;/code&gt;?&lt;/h4&gt;
&lt;p&gt;Without it, you might see the following output in the ping test:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./image-9.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;[/warningbox]&lt;/p&gt;
&lt;p&gt;[infobox title=&quot;Preview of next post&quot;]&lt;/p&gt;
&lt;p&gt;I have written another Ansible script to sync a specific account&apos;s SSH key to a target machine!&lt;/p&gt;
&lt;p&gt;You will see it soon in an upcoming Kubernetes tutorial post!&lt;/p&gt;
&lt;p&gt;[/infobox]&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Create configure_vm.yml Script&lt;/h2&gt;
&lt;p&gt;Think about it—cloning the base image is easy, but manually setting the &lt;strong&gt;hostname, network, and other configs&lt;/strong&gt; for every VM? &lt;strong&gt;No thanks!&lt;/strong&gt; That’s way too much repetitive work. 😵‍💫 I can&apos;t tolerate such tedium in my Kubernetes tutorial!&lt;/p&gt;
&lt;p&gt;So, being the efficiency-loving geek that I am, I wrote a script at:&lt;br /&gt;
📌 &lt;strong&gt;&lt;code&gt;/opt/share_tools/bin/configure_vm.yml&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;With this, after cloning this base image for our Kubernetes cluster setup, I can just feed in an input file, run the script, and &lt;strong&gt;boom&lt;/strong&gt;—it automatically configures each VM with the right settings. Less typing, fewer mistakes, and more time for the fun stuff. Let’s put this script to work! 🚀&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;---
- hosts: localhost
  gather_facts: no
  vars:
    input_file: &quot;{{ input_file_path | default(&apos;input.json&apos;) }}&quot;
    config: &quot;{{ lookup(&apos;file&apos;, input_file) | from_json }}&quot;
    ansible_key_path: &quot;{{ config.ansible_key_path | default(&apos;~/.ssh/ansible_ed25519&apos;) }}&quot;
    ssh_key_path: &quot;{{ config.ssh_key_path | default(&apos;~/.ssh/ssh_ed25519&apos;) }}&quot;

  tasks:
    # Handle Ansible SSH Key
    - name: Check if Ansible SSH private key exists
      stat:
        path: &quot;{{ ansible_key_path }}&quot;
      register: ansible_key_exists

    - name: Remove existing Ansible SSH private key if present
      file:
        path: &quot;{{ ansible_key_path }}&quot;
        state: absent
      when: ansible_key_exists.stat.exists

    - name: Remove existing Ansible SSH public key if present
      file:
        path: &quot;{{ ansible_key_path }}.pub&quot;
        state: absent
      when: ansible_key_exists.stat.exists

    - name: Generate Ansible SSH key pair
      ansible.builtin.openssh_keypair:
        path: &quot;{{ ansible_key_path }}&quot;
        type: ed25519
        state: present
        comment: &quot;ansible@{{ config.hostname }}&quot;

    # Handle SSH Connection Key
    - name: Check if SSH private key exists
      stat:
        path: &quot;{{ ssh_key_path }}&quot;
      register: ssh_key_exists

    - name: Remove existing SSH private key if present
      file:
        path: &quot;{{ ssh_key_path }}&quot;
        state: absent
      when: ssh_key_exists.stat.exists

    - name: Remove existing SSH public key if present
      file:
        path: &quot;{{ ssh_key_path }}.pub&quot;
        state: absent
      when: ssh_key_exists.stat.exists

    - name: Generate SSH key pair for SSH connection
      ansible.builtin.openssh_keypair:
        path: &quot;{{ ssh_key_path }}&quot;
        type: ed25519
        state: present
        comment: &quot;ssh@{{ config.hostname }}&quot;

    - name: Debug the resolved SSH key paths for verification
      debug:
        msg: |
          The Ansible SSH key path is {{ ansible_key_path }}
          The SSH connection key path is {{ ssh_key_path }}

    # Network and Hostname Configuration
    - name: Set IP address and gateway using nmcli
      command: &quot;nmcli con mod ens160 ipv4.addresses {{ config.ip }}/{{ config.subnet }} ipv4.gateway {{ config.gateway }} ipv4.dns &apos;{{ config.dns1 }} {{ config.dns2 }}&apos; ipv4.method manual&quot;
      ignore_errors: yes

    - name: Bring up the connection
      command: nmcli con up ens160
      ignore_errors: yes

    - name: Set the hostname
      command: hostnamectl set-hostname &quot;{{ config.hostname }}&quot;

    - name: Update /etc/hosts - remove baseimage
      lineinfile:
        path: /etc/hosts
        regexp: &apos;baseimage&apos;
        state: absent

    - name: Update /etc/hosts - add new hostname
      lineinfile:
        path: /etc/hosts
        line: &quot;{{ config.ip }} {{ config.hostname }}.{{ config.domain }} {{ config.hostname }}&quot;
        state: present

    - name: Update /etc/zshenv to set ANSIBLE_CONFIG environment variable
      lineinfile:
        path: /etc/zshenv
        line: &quot;export ANSIBLE_CONFIG=/etc/ansible/ansible.cfg&quot;
        create: yes

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This must be the longest script in the Kubernetes tutorial so far!&lt;/p&gt;
&lt;p&gt;It&apos;s actually simple. By the way, I will always show the complete code in my Kubernetes tutorial, so no need to worry about missing anything. If you spot a gap, leave a comment to let me know!&lt;/p&gt;
&lt;p&gt;It configures the VM by setting up SSH keys, network settings, hostname, and environment variables. It reads configuration details from a JSON input file (&lt;code&gt;input.json&lt;/code&gt; by default) and applies the following steps:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. SSH Key Management&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ensures that both the &lt;strong&gt;Ansible SSH keys&lt;/strong&gt; (remember, I generated a separate key pair for Ansible) and the &lt;strong&gt;regular SSH keys&lt;/strong&gt; are properly configured:
&lt;ul&gt;
&lt;li&gt;Removes any existing keys, i.e. the ones inherited from the base image.&lt;/li&gt;
&lt;li&gt;Generates new &lt;strong&gt;Ed25519&lt;/strong&gt; SSH key pairs for &lt;strong&gt;Ansible automation&lt;/strong&gt; and &lt;strong&gt;regular SSH access&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
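&lt;p&gt;For intuition, the two &lt;code&gt;openssh_keypair&lt;/code&gt; tasks boil down to roughly the following &lt;code&gt;ssh-keygen&lt;/code&gt; invocation. This is a standalone sketch of mine (the temp directory and comment are placeholders, not from the playbook):&lt;/p&gt;

```shell
# Roughly what ansible.builtin.openssh_keypair does for each key:
# generate an ed25519 key pair non-interactively, with a comment.
tmp=$(mktemp -d)
ssh-keygen -t ed25519 -N "" -C "ansible@demo-host" -f "$tmp/ansible_ed25519" -q
ls "$tmp"
```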
&lt;p&gt;&lt;strong&gt;2. Network &amp;amp; Hostname Configuration&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Configures the machine&apos;s &lt;strong&gt;IP address, gateway, and DNS&lt;/strong&gt; using &lt;code&gt;nmcli&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Brings up the modified network connection.&lt;/li&gt;
&lt;li&gt;Sets the machine&apos;s &lt;strong&gt;hostname&lt;/strong&gt; using &lt;code&gt;hostnamectl&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Updates &lt;code&gt;/etc/hosts&lt;/code&gt;:
&lt;ul&gt;
&lt;li&gt;Removes any references to &lt;code&gt;baseimage&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Adds a new entry for the machine&apos;s IP and domain.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;3. Environment Variable Setup&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ensures that the &lt;strong&gt;Ansible configuration path&lt;/strong&gt; is set in &lt;code&gt;/etc/zshenv&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
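&lt;p&gt;For reference, here is a sketch of the JSON input file the playbook consumes. The field names mirror the &lt;code&gt;config.*&lt;/code&gt; lookups in the playbook above; the values are illustrative placeholders:&lt;/p&gt;

```shell
# Write a sample input.json; every key below is read by configure_vm.yml
# (hostname, ip, subnet, gateway, dns1, dns2, domain, plus the two
# optional key paths). Values here are placeholders.
printf '%s\n' \
  '{' \
  '  "hostname": "myvm",' \
  '  "ip": "172.16.211.50",' \
  '  "subnet": "24",' \
  '  "gateway": "172.16.211.2",' \
  '  "dns1": "8.8.8.8",' \
  '  "dns2": "8.8.4.4",' \
  '  "domain": "dev.example.local",' \
  '  "ansible_key_path": "~/.ssh/ansible_ed25519",' \
  '  "ssh_key_path": "~/.ssh/ssh_ed25519"' \
  '}' > /tmp/sample_input.json

# Sanity-check that the file parses as JSON before feeding the playbook
python3 -m json.tool /tmp/sample_input.json
```

&lt;p&gt;The playbook would then be invoked with &lt;code&gt;-e &quot;input_file_path=/tmp/sample_input.json&quot;&lt;/code&gt;.&lt;/p&gt;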
&lt;p&gt;[warningbox title=&quot;Warning&quot;]&lt;/p&gt;
&lt;p&gt;Remember the &lt;a href=&quot;/devops/kubernetes/ultimate-setup-tutorial-part1/#Tips_Network_Interface_Names&quot;&gt;Network Interface Name&lt;/a&gt;? You need to update &lt;code&gt;ens160&lt;/code&gt; in the above script to your own network interface name!&lt;/p&gt;
&lt;p&gt;My bad! I should have parameterized it in the script!&lt;/p&gt;
&lt;p&gt;[/warningbox]&lt;/p&gt;
&lt;p&gt;This script is designed for our initial VM provisioning, ensuring SSH access, correct network configuration, and proper hostname resolution. It really makes our Kubernetes cluster setup easier!&lt;/p&gt;
&lt;p&gt;It&apos;s fantastic!&lt;/p&gt;
&lt;h2&gt;Base VM/Image of Kubernetes is done! Clean Up!&lt;/h2&gt;
&lt;p&gt;So now our base VM for Kubernetes is ready. You think that&apos;s the end of this Kubernetes tutorial?! No way! It&apos;s just the start!&lt;/p&gt;
&lt;p&gt;Since we will clone it to new VMs, let&apos;s clean up the logs and stale configuration.&lt;/p&gt;
&lt;p&gt;I created below script &lt;code&gt;/opt/share_tools/bin/clean_up.sh&lt;/code&gt; to do the clean up job!&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#!/bin/bash

echo &quot;Starting system cleanup...&quot;

# Remove all non-builtin users except &apos;admin&apos;, &apos;nobody&apos; and reserved users
USERS=$(awk -F: &apos;($3 &amp;gt;= 1000 &amp;amp;&amp;amp; $1 != &quot;admin&quot; &amp;amp;&amp;amp; $1 != &quot;nobody&quot;) {print $1}&apos; /etc/passwd)
for USER in $USERS; do
    echo &quot;Deleting user: $USER&quot;
    userdel -r $USER
done

# Clean up system logs and temporary files
log_dirs=(
    &quot;/var/log&quot;
    &quot;/var/tmp&quot;
    &quot;/tmp&quot;
)

# Find and delete log and temp files, and print deleted files
for dir in &quot;${log_dirs[@]}&quot;; do
    echo &quot;Cleaning directory: $dir&quot;
    find &quot;$dir&quot; -type f -name &quot;*.log&quot; -print -exec rm -f {} \;
    find &quot;$dir&quot; -type f -name &quot;*.tmp&quot; -print -exec rm -f {} \;
done

echo &quot;Cleaning up package manager cache...&quot;
dnf clean all

echo &quot;Rotating and cleaning journal logs...&quot;
journalctl --rotate
journalctl --vacuum-time=1s

# Remove all non-hidden files under /root except anaconda-ks.cfg
echo &quot;Keeping anaconda-ks.cfg and removing other non-hidden files under /root...&quot;
find /root/ -maxdepth 1 -type f ! -name &quot;anaconda-ks.cfg&quot; -not -name &quot;.*&quot; -print -exec rm -f {} \;
rm -frv /root/.cache 
echo &quot;&quot; &amp;gt; /root/.zsh_history

# Remove all non-hidden files under /home/admin/
echo &quot;Removing all non-hidden files under /home/admin/...&quot;
find /home/admin/ -maxdepth 1 -type f -not -path &apos;*/\.*&apos; -print -exec rm -f {} \;

# Clean up command history
&amp;gt; /home/admin/.bash_history
&amp;gt; /home/admin/.zsh_history
&amp;gt; /root/.bash_history
&amp;gt; /root/.zsh_history
echo &quot;System cleanup complete.&quot;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Just run it once before we shutdown this base VM:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;/opt/share_tools/bin/clean_up.sh
&lt;/code&gt;&lt;/pre&gt;
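&lt;p&gt;If you want to double-check what the log cleanup would delete before running it for real, a small dry-run helper (my own addition, not part of &lt;code&gt;clean_up.sh&lt;/code&gt;) can list the matching files without removing anything:&lt;/p&gt;

```shell
# List the *.log / *.tmp files clean_up.sh would delete, without deleting.
dry_run_clean() {
    for dir in "$@"; do
        find "$dir" -type f \( -name "*.log" -o -name "*.tmp" \) -print 2>/dev/null
    done
}

# Demo against a throwaway directory instead of the real log dirs
mkdir -p /tmp/clean_demo
touch /tmp/clean_demo/app.log /tmp/clean_demo/keep.conf
dry_run_clean /tmp/clean_demo
```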
&lt;hr /&gt;
&lt;h1&gt;Hooray!&lt;/h1&gt;
&lt;p&gt;Spent several days crafting this &lt;strong&gt;Part 1&lt;/strong&gt; post for my kubernetes tutorial — because if I’m doing this, I’m doing it right. My mission? To deliver the &lt;strong&gt;best damn Kubernetes cluster setup tutorial&lt;/strong&gt; on the internet! 🚀&lt;/p&gt;
&lt;p&gt;Up next, in my Kubernetes tutorial &lt;strong&gt;Part 2&lt;/strong&gt;, I’ll walk you through setting up a &lt;strong&gt;localserver&lt;/strong&gt; to handle &lt;strong&gt;DNS and NTP services&lt;/strong&gt; within our Kubernetes cluster environment, laying the foundation for a &lt;strong&gt;fully functional Kubernetes cluster&lt;/strong&gt;. With some luck (and zero typos in config files), we’ll have our nodes talking to each other in no time. Stay tuned! 😎&lt;/p&gt;
&lt;p&gt;:::info
Love my Kubernetes tutorial? Rate it 5 stars!
:::&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;:::info
You&apos;re on a roll! Don&apos;t stop now—check out the full series and level up your Kubernetes skills. Each post builds on the last, so make sure you haven’t missed anything! 👇&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/kubernetes-tutorial-part1&quot;&gt;Part 1&lt;/a&gt;&lt;/strong&gt;, the current post.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/tutorial-part2-dns-server-ntp&quot;&gt;Part 2&lt;/a&gt;&lt;/strong&gt;, I walked through &lt;strong&gt;configuring a local DNS server and NTP server&lt;/strong&gt;, essential for stable name resolution and time synchronization across nodes. These foundational steps will make our Kubernetes setup smoother.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/part3-kubernetes-cluster-setup&quot;&gt;Part 3&lt;/a&gt;&lt;/strong&gt;, I finished the Kubernetes cluster setup with Flannel, with one Kubernetes master and 4 worker nodes ready for real workloads.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/part3-kubernetes-cluster-setup&quot;&gt;Part 4&lt;/a&gt;&lt;/strong&gt;, I explored NodePort and ClusterIP, covering the key differences, use cases, and when to choose each for internal and external service access! 🔥&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/externalname-loadbalancer-5&quot;&gt;Part 5&lt;/a&gt;&lt;/strong&gt;, I explored how to use ExternalName and LoadBalancer Services and how to run load testing with the &lt;code&gt;hey&lt;/code&gt; tool.
:::&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Ultimate Kubernetes Tutorial Part 2: DNS server and NTP server Configuration</title><link>https://geekcoding101.com/posts/tutorial-part2-dns-server-ntp</link><guid isPermaLink="true">https://geekcoding101.com/posts/tutorial-part2-dns-server-ntp</guid><pubDate>Mon, 03 Mar 2025 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Introduction&lt;/h1&gt;
&lt;p&gt;Hey there! Ready to take this Kubernetes setup to the next level? 🚀 In &lt;a href=&quot;/devops/kubernetes/ultimate-setup-tutorial-part1/&quot;&gt;&lt;strong&gt;Part 1&lt;/strong&gt;&lt;/a&gt;, we got our &lt;strong&gt;base VM image&lt;/strong&gt; up and running—nice work! Now, in &lt;strong&gt;Part 2&lt;/strong&gt;, I am going to clone that image to set up a &lt;strong&gt;local server&lt;/strong&gt; as a &lt;strong&gt;&lt;a href=&quot;https://www.cloudflare.com/learning/dns/what-is-a-dns-server/&quot;&gt;DNS server&lt;/a&gt; and &lt;a href=&quot;https://tf.nist.gov/tf-cgi/servers.cgi&quot;&gt;NTP server&lt;/a&gt;&lt;/strong&gt;. I considered also covering the Kubernetes master and worker node setup here, but that felt like too much for one post. Anyway, a real cluster is coming soon! 😎&lt;/p&gt;
&lt;p&gt;Excited? Let’s dive in and make some magic happen. 🔥&lt;/p&gt;
&lt;hr /&gt;
&lt;h1&gt;Create &lt;code&gt;localserver&lt;/code&gt; VM&lt;/h1&gt;
&lt;p&gt;:::info&lt;/p&gt;
&lt;p&gt;This &lt;strong&gt;DNS server&lt;/strong&gt; isn’t meant to replace CoreDNS, which is used inside Kubernetes for service discovery. Instead, it’s a &lt;strong&gt;local DNS server&lt;/strong&gt; for VMs to resolve hostnames within the &lt;strong&gt;private network&lt;/strong&gt;. This ensures that all nodes (master and workers) can communicate using &lt;strong&gt;hostnames instead of IP addresses&lt;/strong&gt;, making cluster management smoother. 🚀&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;h2&gt;Clone from Base Image Rocky 9&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;vmrun clone /Users/geekcoding101.com/Virtual\ Machines.localized/baseimage-rocky9.vmwarevm/baseimage-rocky9.vmx  /Users/geekcoding101.com/Virtual\ Machines.localized/localserver.vmwarevm/localserver.vmx full
sed -i &apos;&apos; &apos;s/displayName = &quot;Clone of baseimage-rocky9&quot;/displayName = &quot;localserver&quot;/&apos; &quot;/Users/geekcoding101.com/Virtual Machines.localized/localserver.vmwarevm/localserver.vmx&quot;
cat &quot;/Users/geekcoding101.com/Virtual Machines.localized/localserver.vmwarevm/localserver.vmx&quot; | grep disp

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The above commands clone the base VM image (display name in VMware Fusion is &lt;code&gt;Clone of baseimage-rocky9&lt;/code&gt;) as a new one, then update the display name of the new VM to &lt;code&gt;localserver&lt;/code&gt; instead of &lt;code&gt;Clone of baseimage-rocky9&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Now, you probably need to run a &lt;strong&gt;scan&lt;/strong&gt; in VMware Fusion to see the newly added VM:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./VMFusion-scan-new-vm.webp&quot; alt=&quot;VMFusion scan new vm&quot; /&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Customize the Local Server VM&lt;/h2&gt;
&lt;p&gt;First, stop the &lt;code&gt;baseimage&lt;/code&gt; VM and start the &lt;code&gt;localserver&lt;/code&gt; VM to avoid network conflict.&lt;/p&gt;
&lt;p&gt;Now we can SSH as &lt;code&gt;root&lt;/code&gt; into the &lt;code&gt;localserver&lt;/code&gt; VM using the IP &lt;code&gt;172.16.211.3&lt;/code&gt; of the &lt;code&gt;base VM&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Remember the script &lt;code&gt;/opt/share_tools/bin/configure_vm.yml&lt;/code&gt; we created in &lt;a href=&quot;/posts/kubernetes-tutorial-part1&quot;&gt;Ultimate Kubernetes Tutorial - Setting Up a Thriving Multi-Node Cluster on Mac: Part 1&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Let&apos;s prepare the input file &lt;code&gt;/opt/share_tools/init_data/localserver_vm_input.json&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  &quot;hostname&quot;: &quot;localserver&quot;,
  &quot;ip&quot;: &quot;172.16.211.100&quot;,
  &quot;subnet&quot;: &quot;24&quot;,
  &quot;gateway&quot;: &quot;172.16.211.2&quot;,
  &quot;dns1&quot;: &quot;8.8.8.8&quot;,
  &quot;dns2&quot;: &quot;8.8.4.4&quot;,
  &quot;domain&quot;: &quot;dev.geekcoding101local.com&quot;,
  &quot;ansible_key_path&quot;: &quot;~/.ssh/ansible_ed25519&quot;,
  &quot;ssh_key_path&quot;: &quot;~/.ssh/ssh_ed25519&quot;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I would suggest you now &lt;strong&gt;use the VMware Fusion console to run the following command&lt;/strong&gt; instead of an SSH terminal, because it will change the IP and might interrupt the SSH connection, causing the script to fail:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ansible-playbook /opt/share_tools/bin/configure_vm.yml -e &quot;input_file_path=/opt/share_tools/init_data/localserver_vm_input.json&quot;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As the input JSON file suggests, the script will:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Update hostname to &lt;code&gt;localserver&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Configure the network interface defined in the script (in my case &lt;code&gt;ens160&lt;/code&gt;) with the IP &lt;code&gt;172.16.211.100&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Generate SSH keys for both Ansible and normal SSH&lt;/li&gt;
&lt;li&gt;Apply other necessary settings&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Once done, you should be able to connect via SSH with the new IP address.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Setting Up DNS Server&lt;/h2&gt;
&lt;p&gt;Now that the localserver is up, let&apos;s install and configure the &lt;code&gt;DNS server&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;dnf update -y
dnf install bind -y

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now let&apos;s start updating the &lt;code&gt;BIND&lt;/code&gt; configuration.&lt;/p&gt;
&lt;p&gt;First one is &lt;code&gt;/etc/named.conf&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ cat /etc/named.conf
//
// named.conf
//
// Provided by Red Hat bind package to configure the ISC BIND named(8) DNS
// server as a caching only nameserver (as a localhost DNS resolver only).
//
// See /usr/share/doc/bind*/sample/ for example named configuration files.
//

options {
        listen-on port 53 { 127.0.0.1; 172.16.211.100; };
        listen-on-v6 port 53 { ::1; };
        directory     &quot;/var/named&quot;;
        dump-file     &quot;/var/named/data/cache_dump.db&quot;;
        statistics-file &quot;/var/named/data/named_stats.txt&quot;;
        memstatistics-file &quot;/var/named/data/named_mem_stats.txt&quot;;
        secroots-file    &quot;/var/named/data/named.secroots&quot;;
        recursing-file    &quot;/var/named/data/named.recursing&quot;;
        allow-query     { any; };
        forwarders {
          8.8.8.8;  # Google&apos;s DNS as a fallback
        };

        /*
         - If you are building an AUTHORITATIVE DNS server, do NOT enable recursion.
         - If you are building a RECURSIVE (caching) DNS server, you need to enable
           recursion.
         - If your recursive DNS server has a public IP address, you MUST enable access
           control to limit queries to your legitimate users. Failing to do so will
           cause your server to become part of large scale DNS amplification
           attacks. Implementing BCP38 within your network would greatly
           reduce such attack surface
        */
        recursion yes;

        dnssec-validation yes;

        managed-keys-directory &quot;/var/named/dynamic&quot;;
        geoip-directory &quot;/usr/share/GeoIP&quot;;

        pid-file &quot;/run/named/named.pid&quot;;
        session-keyfile &quot;/run/named/session.key&quot;;

        /* https://fedoraproject.org/wiki/Changes/CryptoPolicy */
        include &quot;/etc/crypto-policies/back-ends/bind.config&quot;;
};

logging {
        channel default_debug {
                file &quot;data/named.run&quot;;
                severity dynamic;
        };
};

zone &quot;.&quot; IN {
    type hint;
    file &quot;named.ca&quot;;
};
zone &quot;dev.geekcoding101local.com&quot; IN {
    type master;
    file &quot;/var/named/dev.geekcoding101local.com.zone&quot;;
    allow-update { none; };
};
zone &quot;211.16.172.in-addr.arpa&quot; IN {
    type master;
    file &quot;/var/named/211.16.172.in-addr.arpa.zone&quot;;
    allow-update { none; };
};

include &quot;/etc/named.rfc1912.zones&quot;;
include &quot;/etc/named.root.key&quot;;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Second file is DNS zone file &lt;code&gt;/var/named/dev.geekcoding101local.com.zone&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ cat /var/named/dev.geekcoding101local.com.zone
$TTL 86400
@   IN  SOA ns1.dev.geekcoding.com. root.dev.geekcoding101local.com. (
            2024010103  ; Serial
            3600        ; Refresh
            1800        ; Retry
            1209600     ; Expire
            86400 )     ; Minimum TTL

@   IN  NS  localserver.dev.geekcoding101local.com.
localserver IN A 172.16.211.100  ; IP of localserver DNS

k8s-1 IN A 172.16.211.11
k8s-2 IN A 172.16.211.12
k8s-3 IN A 172.16.211.13
k8s-4 IN A 172.16.211.14
k8s-5 IN A 172.16.211.15

devbox IN A 172.16.211.99

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Third file is &lt;code&gt;/var/named/211.16.172.in-addr.arpa.zone&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ cat /var/named/211.16.172.in-addr.arpa.zone
$TTL 86400
@   IN  SOA ns1.dev.geekcoding.com. root.dev.geekcoding101local.com. (
            2024010103  ; Serial
            3600        ; Refresh
            1800        ; Retry
            1209600     ; Expire
            86400 )     ; Minimum TTL

@   IN  NS  localserver.dev.geekcoding101local.com.

; PTR Records
11  IN  PTR k8s-1.dev.geekcoding101local.com.
12  IN  PTR k8s-2.dev.geekcoding101local.com.
13  IN  PTR k8s-3.dev.geekcoding101local.com.
14  IN  PTR k8s-4.dev.geekcoding101local.com.
15  IN  PTR k8s-5.dev.geekcoding101local.com.
99  IN  PTR devbox.dev.geekcoding101local.com.
100 IN  PTR localserver.dev.geekcoding101local.com.

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;:::warning&lt;/p&gt;
&lt;p&gt;Without the reverse zone, you will hit:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ nslookup 172.16.211.100
** server can&apos;t find 100.211.16.172.in-addr.arpa: NXDOMAIN
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;:::&lt;/p&gt;
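&lt;p&gt;The reverse zone&apos;s name looks cryptic, but it is mechanical: take the network part of the subnet (&lt;code&gt;172.16.211&lt;/code&gt; for our &lt;code&gt;/24&lt;/code&gt;), reverse the octets, and append &lt;code&gt;in-addr.arpa&lt;/code&gt;. A tiny sketch of mine to illustrate:&lt;/p&gt;

```shell
# Derive the in-addr.arpa reverse-zone name for a /24 network:
# reverse the first three octets and append the in-addr.arpa suffix.
ip_to_ptr_zone() {
    echo "$1" | awk -F. '{print $3"."$2"."$1".in-addr.arpa"}'
}

ip_to_ptr_zone 172.16.211.0   # the zone name used in named.conf above
```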
&lt;p&gt;Let&apos;s run a test on the files for syntax check:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;named-checkzone dev.geekcoding101local.com /var/named/dev.geekcoding101local.com.zone
named-checkzone 211.16.172.in-addr.arpa /var/named/211.16.172.in-addr.arpa.zone
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./named-check-pass.webp&quot; alt=&quot;named-checkzone command pass on kubernetes cluster to test DNS server&quot; title=&quot;named-checkzone command pass on kubernetes cluster to test DNS server&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Restart DNS Service:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;systemctl restart named
systemctl status named
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./named-status.png&quot; alt=&quot;named service status for DNS server&quot; title=&quot;named service status for DNS server&quot; /&gt;&lt;/p&gt;
&lt;p&gt;:::info&lt;/p&gt;
&lt;p&gt;You might see an &lt;code&gt;Unable to fetch DNSKEY&lt;/code&gt; error above; we can ignore it, as we don&apos;t need DNSKEY.&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;p&gt;Let&apos;s run a &lt;code&gt;nslookup&lt;/code&gt; test:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./nslookup-test-localserver.webp&quot; alt=&quot;DNS server nslookup test localserver in kubernetes environment&quot; title=&quot;DNS server nslookup test localserver in kubernetes environment&quot; /&gt;&lt;/p&gt;
&lt;p&gt;:::warning&lt;/p&gt;
&lt;p&gt;Every time after modifying a DNS zone file, we need to increment the serial number.&lt;/p&gt;
&lt;p&gt;The serial number is in the format YYYYMMDD##.&lt;/p&gt;
&lt;p&gt;For example, if the current serial number is 2024010101, and you&apos;re making a second change on the same day, update it to 2024010102.&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
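&lt;p&gt;To avoid fat-fingering the serial, a one-liner (my own convenience helper, not part of the setup) can generate it in the &lt;code&gt;YYYYMMDD##&lt;/code&gt; form:&lt;/p&gt;

```shell
# Build a zone serial from today's date plus a two-digit change counter.
new_serial() {
    printf '%s%02d\n' "$(date +%Y%m%d)" "$1"
}

new_serial 1   # first change of the day
new_serial 2   # second change of the day
```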
&lt;hr /&gt;
&lt;h2&gt;Setting Up NTP Server&lt;/h2&gt;
&lt;p&gt;To keep time synchronized across all nodes even if the internet connection drops (we&apos;re running all nodes on my laptop, after all), let&apos;s set up the NTP server &lt;code&gt;Chrony&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;:::info&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Chrony&lt;/code&gt; is an implementation of the &lt;a href=&quot;https://en.wikipedia.org/wiki/Network_Time_Protocol&quot;&gt;Network Time Protocol (NTP)&lt;/a&gt;. It is an alternative to &lt;a href=&quot;https://linux.die.net/man/8/ntpd&quot;&gt;ntpd&lt;/a&gt;, a reference implementation of NTP.&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;dnf install chrony -y

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Modify the configuration; it&apos;s actually just a one-line change:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[root@localserver ~]# cat /etc/chrony.conf
...
allow 172.16.211.0/24
...

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Start the NTP Service:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;systemctl restart chronyd
systemctl status chronyd

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Verify with:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ chronyc sources -v

  .-- Source mode  &apos;^&apos; = server, &apos;=&apos; = peer, &apos;#&apos; = local clock.
 / .- Source state &apos;*&apos; = current best, &apos;+&apos; = combined, &apos;-&apos; = not combined,
| /             &apos;x&apos; = may be in error, &apos;~&apos; = too variable, &apos;?&apos; = unusable.
||                                                 .- xxxx [ yyyy ] +/- zzzz
||      Reachability register (octal) -.           |  xxxx = adjusted offset,
||      Log2(Polling interval) --.      |          |  yyyy = measured offset,
||                                \     |          |  zzzz = estimated error.
||                                 |    |           \
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^? 65-100-46-164.dia.static&amp;gt;     1   6   377    16    -33ms[  -33ms] +/- 47ms
^? ntp3.radio-sunshine.org       2   6   377    15    -81ms[  -81ms] +/- 120ms
^? server.slakjd.com             3   6   377    16    -71ms[  -71ms] +/- 44ms
^? kjsl-fmt2-net.fmt2.kjsl.&amp;gt;     2   6   377    16    -65ms[  -65ms] +/- 8139us
^? localserver.dev.geekcodi&amp;gt;     0   6     0     - +0ns[   +0ns] +/- 0ns
[root@localserver ~]#
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;/p&gt;
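&lt;p&gt;On the cluster nodes (configured in later parts), the client side is presumably just a matter of pointing &lt;code&gt;chrony&lt;/code&gt; at this server instead of the default pool. A sketch of a minimal client-side config, written to a temp file here rather than the real &lt;code&gt;/etc/chrony.conf&lt;/code&gt;:&lt;/p&gt;

```shell
# Minimal client-side chrony.conf for a cluster node (sketch):
# use localserver as the only time source.
cfg=/tmp/chrony-client.conf
printf '%s\n' \
  'server 172.16.211.100 iburst' \
  'driftfile /var/lib/chrony/drift' \
  'makestep 1.0 3' > "$cfg"
cat "$cfg"
```

&lt;p&gt;On a real node you would put those lines in &lt;code&gt;/etc/chrony.conf&lt;/code&gt;, then &lt;code&gt;systemctl restart chronyd&lt;/code&gt; and verify with &lt;code&gt;chronyc sources&lt;/code&gt;.&lt;/p&gt;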
&lt;hr /&gt;
&lt;h1&gt;Wrapping Up&lt;/h1&gt;
&lt;p&gt;At this point, our &lt;code&gt;localserver&lt;/code&gt; is now running &lt;strong&gt;DNS and NTP services&lt;/strong&gt; 🚀&lt;/p&gt;
&lt;p&gt;In &lt;strong&gt;Part 3&lt;/strong&gt;, I will:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Configure a &lt;strong&gt;Kubernetes base image&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Spin up the master node and 4 worker nodes with the Kubernetes base image&lt;/li&gt;
&lt;li&gt;Setup the &lt;strong&gt;K8s Master Node&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Join the worker nodes to the cluster&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Stay tuned, and let’s keep this cluster rolling! 🚀🔥&lt;/p&gt;
&lt;p&gt;:::info
You&apos;re on a roll! Don&apos;t stop now—check out the full series and level up your Kubernetes skills. Each post builds on the last, so make sure you haven’t missed anything! 👇&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/kubernetes-tutorial-part1&quot;&gt;Part 1&lt;/a&gt;&lt;/strong&gt;, I laid out the &lt;strong&gt;networking plan&lt;/strong&gt;, my &lt;strong&gt;goals for setting up Kubernetes&lt;/strong&gt;, and how to &lt;strong&gt;prepare a base VM image&lt;/strong&gt; for the cluster.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/tutorial-part2-dns-server-ntp&quot;&gt;Part 2&lt;/a&gt;&lt;/strong&gt;, current post.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/part3-kubernetes-cluster-setup&quot;&gt;Part 3&lt;/a&gt;&lt;/strong&gt;, I finished the Kubernetes cluster setup with Flannel, with one Kubernetes master and 4 worker nodes ready for real workloads.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/part3-kubernetes-cluster-setup&quot;&gt;Part 4&lt;/a&gt;&lt;/strong&gt;, I explored NodePort and ClusterIP, covering the key differences, use cases, and when to choose each for internal and external service access! 🔥&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/externalname-loadbalancer-5&quot;&gt;Part 5&lt;/a&gt;&lt;/strong&gt;, I explored how to use ExternalName and LoadBalancer Services and how to run load testing with the &lt;code&gt;hey&lt;/code&gt; tool.
:::&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Ultimate Kubernetes Tutorial Part 3: A Streamlined Kubernetes cluster setup</title><link>https://geekcoding101.com/posts/part3-kubernetes-cluster-setup</link><guid isPermaLink="true">https://geekcoding101.com/posts/part3-kubernetes-cluster-setup</guid><pubDate>Sun, 09 Mar 2025 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Introduction&lt;/h1&gt;
&lt;p&gt;Welcome back to the &lt;a href=&quot;/tags/kubernetes&quot;&gt;&lt;strong&gt;Kubernetes tutorial&lt;/strong&gt;&lt;/a&gt; series! Now that our &lt;a href=&quot;/posts/kubernetes-tutorial-part1&quot;&gt;base image&lt;/a&gt; and &lt;a href=&quot;/posts/tutorial-part2-dns-server-ntp&quot;&gt;local server&lt;/a&gt; are ready, it’s time for the real action—&lt;strong&gt;Kubernetes cluster setup with Flannel&lt;/strong&gt;. I&apos;ll spin up one &lt;strong&gt;Kubernetes master and 4 worker nodes&lt;/strong&gt;, forming a &lt;strong&gt;local Kubernetes cluster&lt;/strong&gt; that’s ready for real workloads. No more theory—let’s build something real! 🚀&lt;/p&gt;
&lt;hr /&gt;
&lt;h1&gt;Clone baseimage to k8s-1 as The Kubernetes VM Base Image&lt;/h1&gt;
&lt;p&gt;Before jumping into our Kubernetes cluster setup, let&apos;s start from my &lt;strong&gt;Mac&apos;s terminal&lt;/strong&gt; and clone the Rocky 9 base image as &lt;code&gt;k8s-1&lt;/code&gt;, which will serve as our Kubernetes base image:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ vmrun clone /Users/geekcoding101.com/Virtual\ Machines.localized/baseimage-rocky9.vmwarevm/baseimage-rocky9.vmx  /Users/geekcoding101.com/Virtual\ Machines.localized/k8s-1.vmwarevm/k8s-1.vmx full
❯ sed -i &apos;&apos; &apos;s/displayName = &quot;Clone of baseimage-rocky9&quot;/displayName = &quot;k8s-1&quot;/&apos; &quot;/Users/geekcoding101.com/Virtual Machines.localized/k8s-1.vmwarevm/k8s-1.vmx&quot;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Make sure you&apos;ve stopped the baseimage VM, then start the &lt;code&gt;k8s-1&lt;/code&gt; VM.&lt;/p&gt;
&lt;p&gt;I covered these steps in detail in &lt;a href=&quot;/posts/tutorial-part2-dns-server-ntp#Clone_from_Base_Image_Rocky_9&quot;&gt;Part 2&lt;/a&gt;. In short, after the above commands, rescan in VMware Fusion, SSH as &lt;code&gt;root&lt;/code&gt; into &lt;code&gt;k8s-1&lt;/code&gt; using the IP &lt;code&gt;172.16.211.3&lt;/code&gt; of the &lt;code&gt;base VM&lt;/code&gt;, and prepare the input file &lt;code&gt;/opt/share_tools/init_data/k8s-1_vm_input.json&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  &quot;hostname&quot;: &quot;k8s-1&quot;,
  &quot;ip&quot;: &quot;172.16.211.11&quot;,
  &quot;subnet&quot;: &quot;24&quot;,
  &quot;gateway&quot;: &quot;172.16.211.2&quot;,
  &quot;dns1&quot;: &quot;172.16.211.100&quot;,
  &quot;dns2&quot;: &quot;8.8.8.8&quot;,
  &quot;domain&quot;: &quot;dev.geekcoding101local.com&quot;,
  &quot;ansible_key_path&quot;: &quot;~/.ssh/ansible_ed25519&quot;,
  &quot;ssh_key_path&quot;: &quot;~/.ssh/ssh_ed25519&quot;
}

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then log in to the VM via the VMware Fusion console and run the command below to generate SSH keys and set up networking:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ansible-playbook /opt/share_tools/bin/configure_vm.yml -e &quot;input_file_path=/opt/share_tools/init_data/k8s-1_vm_input.json&quot;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now I can connect via SSH passwordlessly using the new IP &lt;code&gt;172.16.211.11&lt;/code&gt;.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Test DNS&lt;/h2&gt;
&lt;p&gt;Note that we are testing our local DNS server here to make sure it works in our Kubernetes cluster setup; it is not going to replace CoreDNS.&lt;/p&gt;
&lt;p&gt;First, ensure the DNS server &lt;code&gt;localserver (172.16.211.100)&lt;/code&gt; we set up in &lt;a href=&quot;/posts/tutorial-part2-dns-server-ntp#Clone_from_Base_Image_Rocky_9&quot;&gt;Part 2&lt;/a&gt; is running.&lt;/p&gt;
&lt;p&gt;Then ensure &lt;code&gt;172.16.211.100&lt;/code&gt; is at the top of &lt;code&gt;/etc/resolv.conf&lt;/code&gt;, the same as below:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ cat /etc/resolv.conf
# Generated by NetworkManager
search localdomain dev.geekcoding101local.com
nameserver 172.16.211.100
nameserver 8.8.8.8
nameserver 172.16.68.2
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;172.16.68.2&lt;/code&gt; is assigned via &lt;code&gt;ens224&lt;/code&gt;, the network adapter we added to the VM for internet access, because &lt;strong&gt;VMware Fusion&lt;/strong&gt; typically assigns &lt;strong&gt;&lt;code&gt;172.16.68.1&lt;/code&gt; and &lt;code&gt;172.16.68.2&lt;/code&gt; as DNS servers&lt;/strong&gt; for virtual machines when using &lt;strong&gt;NAT (Network Address Translation)&lt;/strong&gt; networking.&lt;/p&gt;
&lt;p&gt;Test DNS and hostname as below as &lt;code&gt;root&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;nslookup k8s-1
nslookup k8s-1.dev.geekcoding101local.com
hostname -f
hostname -s

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./image-1.png&quot; alt=&quot;Kubernetes cluster setup: Test DNS&quot; title=&quot;Kubernetes cluster setup: Test DNS&quot; /&gt;&lt;/p&gt;
&lt;p&gt;:::warning
However, if you have &lt;code&gt;172.16.68.2&lt;/code&gt; at the top of &lt;code&gt;/etc/resolv.conf&lt;/code&gt;, you would hit:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./nslookup-fail.webp&quot; alt=&quot;nslookup fail on k8s-1&quot; /&gt;&lt;/p&gt;
&lt;p&gt;I don&apos;t recommend to manually update &lt;code&gt;/etc/resolv.conf&lt;/code&gt;  to fix it as you&apos;ve seen &quot;Generated by NetworkManager&quot;.&lt;/p&gt;
&lt;p&gt;The reason &lt;code&gt;172.16.68.2&lt;/code&gt; ends up ahead of &lt;code&gt;172.16.211.100&lt;/code&gt; is the order in which the nmcli commands were run against the network adapters &lt;code&gt;ens160&lt;/code&gt; and &lt;code&gt;ens224&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Now if we shut down &lt;code&gt;ens224&lt;/code&gt; and bring it up again, &lt;code&gt;172.16.68.2&lt;/code&gt; will move to the bottom of &lt;code&gt;/etc/resolv.conf&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;nmcli dev down ens224
nmcli dev up ens224

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./nslookup-start-work.webp&quot; alt=&quot;nslookup start work&quot; /&gt;&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;hr /&gt;
&lt;h1&gt;Setup Docker&lt;/h1&gt;
&lt;p&gt;In the &lt;code&gt;k8s-1&lt;/code&gt; SSH session, we can now install the common packages required by the Kubernetes cluster setup.&lt;/p&gt;
&lt;p&gt;In this Kubernetes cluster setup, we will use &lt;code&gt;k8s-1&lt;/code&gt; as a base image, so we can easily clone it as &lt;code&gt;k8s-2&lt;/code&gt;, &lt;code&gt;k8s-3&lt;/code&gt;, &lt;code&gt;k8s-4&lt;/code&gt; and &lt;code&gt;k8s-5&lt;/code&gt; without repeating the common package installation!&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;dnf update -y

dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
dnf install docker-ce docker-ce-cli containerd.io socat -y
systemctl enable --now docker

systemctl start docker
systemctl status docker

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./start-docker.webp&quot; alt=&quot;start docker&quot; /&gt;&lt;/p&gt;
&lt;p&gt;:::info
&lt;code&gt;socat&lt;/code&gt; is required by &lt;code&gt;kubeadm init&lt;/code&gt;, which we will run later; without it, you will hit a warning like this:&lt;img src=&quot;./socat-required.png&quot; alt=&quot;socat is required&quot; /&gt;
:::&lt;/p&gt;
&lt;p&gt;:::warning
&lt;a href=&quot;https://docs.docker.com/engine/install/&quot;&gt;&lt;code&gt;docker-ce&lt;/code&gt; and &lt;code&gt;docker-ce-cli&lt;/code&gt;&lt;/a&gt; are not strictly necessary: Docker support (via dockershim) was deprecated in Kubernetes 1.20 and later removed, as Kubernetes no longer relies on Docker as its container runtime. However, &lt;code&gt;Docker&lt;/code&gt; can still be useful in a development environment, offering a full containerization platform that includes tools for building images, managing networks and volumes, and simple orchestration.&lt;/p&gt;
&lt;p&gt;But installing it introduces a problem: the packaged &lt;code&gt;/etc/containerd/config.toml&lt;/code&gt; disables the CRI plugin like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;disabled_plugins = [&quot;cri&quot;]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This has an important impact on how containerd integrates with Kubernetes.&lt;/p&gt;
&lt;p&gt;The CRI plugin is specifically needed when we&apos;re using &lt;a href=&quot;https://containerd.io/&quot;&gt;&lt;strong&gt;containerd&lt;/strong&gt;&lt;/a&gt; as our container runtime for Kubernetes. It enables containerd to communicate with Kubernetes by implementing the &lt;strong&gt;Container Runtime Interface (CRI)&lt;/strong&gt;, which Kubernetes uses to manage containers. Docker itself does not need the CRI plugin, which is why the Docker installation above ships a config that disables it. But I&apos;ve already explained why I want Docker around, and I want &lt;a href=&quot;https://containerd.io/&quot;&gt;&lt;strong&gt;containerd&lt;/strong&gt;&lt;/a&gt; as our container runtime, so let&apos;s just fix the configuration issue.&lt;/p&gt;
&lt;p&gt;Just regenerate &lt;code&gt;config.toml&lt;/code&gt;, which enables the CRI plugin by default; &lt;code&gt;SystemdCgroup&lt;/code&gt; then needs to be manually set to &lt;code&gt;true&lt;/code&gt; with the commands below:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;containerd config default | sudo tee /etc/containerd/config.toml
sudo sed -i &apos;s/SystemdCgroup = false/SystemdCgroup = true/&apos; /etc/containerd/config.toml

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Kubernetes uses cgroups for managing resources like CPU, memory, and I/O for containers. If &lt;code&gt;SystemdCgroup = true&lt;/code&gt;, &lt;code&gt;containerd&lt;/code&gt; integrates with systemd to manage cgroups, which is the preferred method on most modern Linux distributions using systemd as their init system (e.g., Rocky Linux, etc.).&lt;/p&gt;
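&lt;p&gt;To double-check the regenerated file, here&apos;s a sketch that inspects a config.toml snippet for the two settings kubeadm cares about. The snippet below is a trimmed sample, not the full file; on the node, point the variable at the real &lt;code&gt;/etc/containerd/config.toml&lt;/code&gt;:&lt;/p&gt;

```shell
# Trimmed sample of a regenerated config; on the node use:
#   cfg=$(cat /etc/containerd/config.toml)
cfg='disabled_plugins = []
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true'

# containerd must use the systemd cgroup driver...
if printf '%s\n' "$cfg" | grep -q 'SystemdCgroup = true'; then
  cgroup_ok=yes
else
  cgroup_ok=no
fi

# ...and the cri plugin must not appear in disabled_plugins.
if printf '%s\n' "$cfg" | grep '^disabled_plugins' | grep -q cri; then
  cri_ok=no
else
  cri_ok=yes
fi
echo "systemd cgroups: $cgroup_ok, cri enabled: $cri_ok"
```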
&lt;p&gt;:::&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Execute &lt;code&gt;docker&lt;/code&gt; Commands Without sudo as a Non-root Account&lt;/h2&gt;
&lt;p&gt;If running as a non-root account, you will encounter this permission error:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[admin@k8s-01 ~]$ docker ps
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get &quot;http://%2Fvar%2Frun%2Fdocker.sock/v1.47/containers/json&quot;: dial unix /var/run/docker.sock: connect: permission denied
[admin@k8s-01 ~]$
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So let&apos;s add the user to the &lt;code&gt;docker&lt;/code&gt; group:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[admin@k8s-1 ~]$ sudo usermod -aG docker $(whoami)

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Log out of the current session and log in again as &lt;code&gt;admin&lt;/code&gt;; &lt;code&gt;docker ps&lt;/code&gt; should work now.&lt;/p&gt;
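&lt;p&gt;To verify the group change took effect after re-login, here&apos;s a small sketch that checks for &lt;code&gt;docker&lt;/code&gt; in the account&apos;s group list. The group list is a sample; on the VM use &lt;code&gt;groups_out=$(id -nG)&lt;/code&gt;:&lt;/p&gt;

```shell
# Sample group list; on the VM use:
#   groups_out=$(id -nG)
groups_out='admin wheel docker'

# Group membership is only re-read at login, which is why a re-login is needed.
case " $groups_out " in
  *" docker "*) in_docker=yes ;;
  *) in_docker=no ;;
esac
echo "in docker group: $in_docker"
```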
&lt;p&gt;Let&apos;s test docker (or switch back to the &lt;code&gt;root&lt;/code&gt; account to run the test...):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;docker run hello-world

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./test-docker-command-as-admin.webp&quot; alt=&quot;test docker command as admin account&quot; /&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h1&gt;Configure OS To Support Kubernetes&lt;/h1&gt;
&lt;h2&gt;Disable swap for Kubernetes&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;swapoff -a
sed -i &apos;/swap/d&apos; /etc/fstab

&lt;/code&gt;&lt;/pre&gt;
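&lt;p&gt;To confirm the swap removal stuck, here&apos;s a small sketch that counts swap entries in fstab-style text. The lines below are a sample; on the node read the real &lt;code&gt;/etc/fstab&lt;/code&gt;, and also confirm that &lt;code&gt;swapon --show&lt;/code&gt; prints nothing:&lt;/p&gt;

```shell
# Sample fstab content after the sed edit; on the node use:
#   fstab=$(cat /etc/fstab)
fstab='/dev/mapper/rl-root / xfs defaults 0 0
UUID=0a1b2c3d /boot xfs defaults 0 0'

# Count any lines still mentioning swap; kubelet refuses to run with swap enabled.
swap_left=$(printf '%s\n' "$fstab" | grep -c swap)
echo "swap entries remaining: $swap_left"
```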
&lt;hr /&gt;
&lt;h2&gt;Configure Linux kernel&apos;s networking parameters&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;br_netfilter&lt;/code&gt;: Kubernetes uses network bridges to connect Pods, and the &lt;code&gt;br_netfilter&lt;/code&gt; module ensures that iptables can see and manipulate bridged traffic. This is essential for Kubernetes&apos; internal networking (such as inter-Pod communication and service routing).&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;overlay&lt;/code&gt; module does not need to be added to &lt;code&gt;/etc/modules-load.d/k8s.conf&lt;/code&gt; on Rocky 9, because I found it already ships with the kernel:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ modinfo overlay
filename:       /lib/modules/5.14.0-427.37.1.el9_4.x86_64/kernel/fs/overlayfs/overlay.ko.xz
alias:          fs-overlay
license:        GPL
description:    Overlay filesystem
author:         Miklos Szeredi &amp;lt;miklos@szeredi.hu&amp;gt;
rhelversion:    9.4
srcversion:     6DB4565DD58AB453DBFAD2A
depends:
retpoline:      Y
intree:         Y
name:           overlay
vermagic:       5.14.0-427.37.1.el9_4.x86_64 SMP preempt mod_unload modversions
sig_id:         PKCS#7
signer:         Rocky kernel signing key
sig_key:        52:A7:4C:F4:7A:B4:B1:12:D3:1E:72:33:0A:0D:49:8B:C3:34:88:DC
sig_hashalgo:   sha256
signature:      18:AF:F5:F2:12:80:5A:92:B3:5E:29:B2:A5:10:E8:27:90:73:B4:B2:
                25:B0:04:42:2B:28:FF:86:50:0D:82:CA:12:68:93:70:9F:04:C5:3C:
                19:B2:29:47:41:DD:7F:1D:33:18:33:B7:50:2C:30:A4:0D:CB:1E:53:
                4A:66:B8:BF:CB:41:F8:89:3E:5E:CA:63:8B:0C:2F:CD:42:AD:63:9D:
                C4:6A:31:FD:4B:46:0C:33:38:5A:BA:11:B0:66:76:BF:54:7B:B7:63:
                35:1B:76:52:D2:04:BF:83:65:A7:C6:0D:D1:CB:96:BF:60:37:54:37:
                3E:1B:76:69:9C:2F:8F:8D:81:21:88:33:96:EA:E6:C3:97:D1:1E:8F:
                BC:BD:70:82:27:2A:F3:8C:11:1D:AC:AC:13:00:F6:CD:00:BD:6C:3E:
                40:6F:F2:54:9C:E3:62:A7:17:78:4C:3C:43:A0:49:4D:61:FE:FD:A6:
                CD:51:5F:E6:F3:47:B7:70:D4:5E:55:3C:B8:8C:D5:45:81:6F:47:E4:
                80:39:E1:BA:0D:79:21:64:A6:7E:4D:ED:59:09:F1:26:D2:06:98:E5:
                EB:E5:B1:58:F5:AF:89:0B:0E:8B:65:EB:2A:83:30:48:FD:AC:48:AB:
                12:39:EF:3C:BB:DA:CC:26:F8:38:7F:C8:2D:15:7D:4D:3A:E6:8F:AA:
                AB:16:79:39:2D:2E:9D:5B:76:29:6F:BE:74:4E:65:F5:1F:01:43:58:
                DE:12:54:B5:C7:9E:A5:4C:B0:1D:5E:9B:05:AF:CF:B8:33:28:B4:8E:
                6E:A1:E1:58:7D:CC:F2:61:51:EA:B1:C0:BD:BE:02:56:43:6D:5A:67:
                D7:F0:25:02:91:70:74:AE:F4:6F:D3:E9:9A:1E:D0:DD:BA:C2:3C:B3:
                07:C4:F3:AD:37:63:6B:2B:B9:1D:FB:0B:CC:0B:B7:E3:14:EA:2E:28:
                D7:56:97:88:91:A5:3F:59:5D:21:7E:88:EA:AB:49:E3:3B:77:5B:F3:
                9F:56:EE:46
parm:           check_copy_up:Obsolete; does nothing
parm:           redirect_max:Maximum length of absolute redirect xattr value (ushort)
parm:           redirect_dir:Default to on or off for the redirect_dir feature (bool)
parm:           redirect_always_follow:Follow redirects even if redirect_dir feature is turned off (bool)
parm:           index:Default to on or off for the inodes index feature (bool)
parm:           nfs_export:Default to on or off for the NFS export feature (bool)
parm:           xino_auto:Auto enable xino feature (bool)
parm:           metacopy:Default to on or off for the metadata only copy up feature (bool)
❯ cd /lib/modules/$(uname -r)/kernel/fs/overlayfs
❯ ls
overlay.ko.xz

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;bridge-nf-call-iptables&lt;/code&gt;: Without this, if &lt;code&gt;iptables&lt;/code&gt; is not configured to handle bridged traffic, the network policies and traffic filtering between pods and services may not work correctly.&lt;/p&gt;
&lt;p&gt;Now apply the configuration described above:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ modprobe br_netfilter
❯ echo &apos;1&apos; &amp;gt; /proc/sys/net/bridge/bridge-nf-call-iptables
❯ tee /etc/modules-load.d/k8s.conf &amp;lt;&amp;lt;EOF
br_netfilter
EOF
br_netfilter
❯ tee /etc/sysctl.d/k8s.conf &amp;lt;&amp;lt;EOF
net.ipv4.ip_forward = 1 
net.bridge.bridge-nf-call-ip6tables = 1 
net.bridge.bridge-nf-call-iptables = 1
EOF
❯ sysctl --system
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You might see some articles configure &lt;code&gt;&quot;ip_vs, ip_vs_rr, ip_vs_wrr, ip_vs_sh&quot;&lt;/code&gt; in modules-load.d/k8s.conf; however, those modules are required by &lt;a href=&quot;https://kubernetes.io/blog/2018/07/09/ipvs-based-in-cluster-load-balancing-deep-dive/&quot;&gt;ipvs&lt;/a&gt;, while we&apos;re using iptables. iptables is easier to work with than ipvs in a development environment. At scale, though, iptables struggles to handle tens of thousands of Services because it is designed purely for firewalling purposes and is based on in-kernel rule lists.&lt;/p&gt;
&lt;p&gt;I&apos;ve seen several articles on Kubernetes cluster setup that disable firewalld; just a reminder that disabling firewalld does not remove the need for the kernel module and sysctl configuration shown above, which enables network filtering for bridged traffic. Those commands are critical for Kubernetes networking.&lt;/p&gt;
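&lt;p&gt;As a final check for this section, here&apos;s a sketch that parses sysctl-style output and flags any of the three parameters that is not 1. The output below is a sample; on the node, feed it the real output of &lt;code&gt;sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables&lt;/code&gt;:&lt;/p&gt;

```shell
# Sample sysctl output; on the node capture the real values with:
#   out=$(sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables)
out='net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1'

# Collect any parameter whose value is not 1.
bad=$(printf '%s\n' "$out" | awk -F' = ' '$2 != 1 {print $1}')
if [ -z "$bad" ]; then
  echo "all kernel parameters set"
else
  echo "not set: $bad"
fi
```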
&lt;hr /&gt;
&lt;h2&gt;Install Kubernetes Packages&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;cat &amp;lt;&amp;lt;EOF | tee /etc/yum.repos.d/k8s.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF

dnf makecache
# disableexcludes ensures that packages from the Kubernetes repository are not excluded during installation.
dnf install -y kubelet kubeadm kubectl --disableexcludes=kubernetes    

systemctl enable kubelet &amp;amp;&amp;amp; systemctl start kubelet &amp;amp;&amp;amp; systemctl status kubelet
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./kubernetes-kubelet-status.png&quot; alt=&quot;kubernetes kubelet status&quot; /&gt;&lt;/p&gt;
&lt;p&gt;:::warning
Don&apos;t worry about any kubelet errors at this point (you might see some in the output of &lt;code&gt;systemctl status kubelet&lt;/code&gt;; I should have captured the full screenshot above). Once the worker nodes successfully join the Kubernetes cluster, the kubelet service will automatically activate and start communicating with the control plane.
:::&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Enable Firewalld&lt;/h2&gt;
&lt;p&gt;Remember that we disabled &lt;code&gt;firewalld&lt;/code&gt; in &lt;a href=&quot;/posts/kubernetes-tutorial-part1&quot;&gt;Part 1: Setting Up a Thriving Multi-Node Cluster on Mac&lt;/a&gt;? We need to enable it now in our Kubernetes cluster setup, as I want my environment to resemble production.&lt;/p&gt;
&lt;p&gt;:::info
Even if firewalld is disabled, Kubernetes still needs proper network configurations for bridged traffic.
:::&lt;/p&gt;
&lt;h3&gt;Open Required Ports&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Port(s)&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;6443&lt;/td&gt;
&lt;td&gt;Kubernetes API server&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2379-2380&lt;/td&gt;
&lt;td&gt;etcd server client API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10250&lt;/td&gt;
&lt;td&gt;Kubelet API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10251&lt;/td&gt;
&lt;td&gt;kube-scheduler&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10252&lt;/td&gt;
&lt;td&gt;kube-controller-manager&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10255&lt;/td&gt;
&lt;td&gt;Read-only Kubelet API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5473&lt;/td&gt;
&lt;td&gt;Cluster Control Plane Config API&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;pre&gt;&lt;code&gt;systemctl unmask firewalld
systemctl start firewalld

firewall-cmd --zone=public --permanent --add-port=6443/tcp
firewall-cmd --zone=public --permanent --add-port=2379-2380/tcp
firewall-cmd --zone=public --permanent --add-port=10250/tcp
firewall-cmd --zone=public --permanent --add-port=10251/tcp
firewall-cmd --zone=public --permanent --add-port=10252/tcp
firewall-cmd --zone=public --permanent --add-port=10255/tcp
firewall-cmd --zone=public --permanent --add-port=5473/tcp

firewall-cmd --zone=public --permanent --list-ports

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;:::info
Docker manages its own iptables rules independently, even if firewalld is disabled. However, you still need to open specific Kubernetes-related ports to allow communication between control-plane and worker nodes.
:::&lt;/p&gt;
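&lt;p&gt;The seven firewall-cmd calls above can also be generated from a single port list; this sketch only prints the commands for review rather than running them. One note: &lt;code&gt;--permanent&lt;/code&gt; rules take effect in the running firewall only after &lt;code&gt;firewall-cmd --reload&lt;/code&gt; (or a firewalld restart):&lt;/p&gt;

```shell
# Ports from the table above.
ports='6443 2379-2380 10250 10251 10252 10255 5473'

# Generate (not execute) one firewall-cmd call per port range.
cmds=$(for p in $ports; do
  echo "firewall-cmd --zone=public --permanent --add-port=${p}/tcp"
done)
printf '%s\n' "$cmds"
# To apply permanent rules to the running firewall: firewall-cmd --reload
```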
&lt;p&gt;:::warning
If you see firewall warnings after reboot, like this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./firewalld-start-warning.png&quot; alt=&quot;warning of firewalld when start after reboot system&quot; /&gt;&lt;/p&gt;
&lt;p&gt;It&apos;s likely because the Docker service was not yet up during startup. It&apos;s safe to ignore.
:::&lt;/p&gt;
&lt;h1&gt;Pull Images with crictl&lt;/h1&gt;
&lt;p&gt;As admin account, perform:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo kubeadm config images pull
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;:::warning
You might hit:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ sudo kubeadm config images pull
W0308 22:04:13.872981  109867 version.go:104] could not fetch a Kubernetes version from the internet: unable to get URL &quot;https://dl.k8s.io/release/stable-1.txt&quot;: Get &quot;https://cdn.dl.k8s.io/release/stable-1.txt&quot;: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
W0308 22:04:13.873156  109867 version.go:105] falling back to the local client version: v1.29.14
failed to pull image &quot;registry.k8s.io/kube-apiserver:v1.29.14&quot;: output: time=&quot;2025-03-08T22:04:14-08:00&quot; level=fatal msg=&quot;validate service connection: validate CRI v1 image API for endpoint \&quot;unix:///var/run/containerd/containerd.sock\&quot;: rpc error: code = Unimplemented desc = unknown service runtime.v1.ImageService&quot;
, error: exit status 1
To see the stack trace of this error execute with --v=5 or higher
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The solution is to manually create the file &lt;code&gt;/etc/crictl.yaml&lt;/code&gt; (e.g., &lt;code&gt;sudo vim /etc/crictl.yaml&lt;/code&gt;) with the content below:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then restart &lt;code&gt;containerd&lt;/code&gt; service:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo systemctl restart containerd
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./kubeadm-config-images-pull.webp&quot; alt=&quot;kubeadm config images pull&quot; /&gt;&lt;/p&gt;
&lt;p&gt;So now we have Kubernetes images for our Kubernetes cluster setup.&lt;/p&gt;
&lt;p&gt;Hey, just a reminder here: we can only use &lt;code&gt;crictl&lt;/code&gt; to manage images for Kubernetes, because our Kubernetes uses containerd instead of Docker as the container runtime. And &lt;code&gt;crictl&lt;/code&gt; needs access to &lt;code&gt;/run/containerd/containerd.sock&lt;/code&gt;, which is owned by &lt;code&gt;root:root&lt;/code&gt;, so please remember to use &lt;code&gt;sudo&lt;/code&gt; if you&apos;re logged in as a non-root account:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./crictl-images-first-time.webp&quot; alt=&quot;run crictl images for the first time&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Checking Docker images (it only has the &lt;code&gt;hello-world&lt;/code&gt; image pulled earlier when we were testing Docker):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ sudo docker images
[sudo] password for admin:
REPOSITORY    TAG       IMAGE ID       CREATED         SIZE
hello-world   latest    d2c94e258dcb   17 months ago   13.3kB
&lt;/code&gt;&lt;/pre&gt;
&lt;hr /&gt;
&lt;h1&gt;Create Kubernetes Worker Nodes&lt;/h1&gt;
&lt;p&gt;We now have our Kubernetes base image &lt;code&gt;k8s-1&lt;/code&gt; ready!&lt;/p&gt;
&lt;p&gt;Before we further configure &lt;code&gt;k8s-1&lt;/code&gt; as our master node, it&apos;s time to shut it down and clone it as &lt;code&gt;k8s-base-image&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ vmrun clone /Users/geekcoding101.com/Virtual\ Machines.localized/k8s-1.vmwarevm/k8s-1.vmx  /Users/geekcoding101.com/Virtual\ Machines.localized/k8s-base-image.vmwarevm/k8s-base-image.vmx full

❯ sed -i &apos;&apos; &apos;s/displayName = &quot;Clone of k8s-1&quot;/displayName = &quot;k8s-base-image&quot;/&apos; &quot;/Users/geekcoding101.com/Virtual Machines.localized/k8s-base-image.vmwarevm/k8s-base-image.vmx&quot;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./vmfusino-vm-list.webp&quot; alt=&quot;vmfusion vm list&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Then repeat the steps in &lt;a href=&quot;/?p=4903&amp;amp;preview=true#Clone_baseimage_to_k8s-1_as_The_Kubernetes_VM_Base_Image&quot;&gt;Clone baseimage to k8s-1 as The Kubernetes VM Base Image&lt;/a&gt; to clone &lt;code&gt;k8s-base-image&lt;/code&gt; to &lt;code&gt;k8s-2&lt;/code&gt;, &lt;code&gt;k8s-3&lt;/code&gt;, &lt;code&gt;k8s-4&lt;/code&gt; and &lt;code&gt;k8s-5&lt;/code&gt; in our Kubernetes cluster setup.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;vmrun clone /Users/geekcoding101.com/Virtual\ Machines.localized/k8s-base-image.vmwarevm/k8s-base-image.vmx  /Users/geekcoding101.com/Virtual\ Machines.localized/k8s-2.vmwarevm/k8s-2.vmx full
sed -i &apos;&apos; &apos;s/displayName = &quot;Clone of k8s-base-image&quot;/displayName = &quot;k8s-2&quot;/&apos; &quot;/Users/geekcoding101.com/Virtual Machines.localized/k8s-2.vmwarevm/k8s-2.vmx&quot;

vmrun clone /Users/geekcoding101.com/Virtual\ Machines.localized/k8s-base-image.vmwarevm/k8s-base-image.vmx  /Users/geekcoding101.com/Virtual\ Machines.localized/k8s-3.vmwarevm/k8s-3.vmx full
sed -i &apos;&apos; &apos;s/displayName = &quot;Clone of k8s-base-image&quot;/displayName = &quot;k8s-3&quot;/&apos; &quot;/Users/geekcoding101.com/Virtual Machines.localized/k8s-3.vmwarevm/k8s-3.vmx&quot;

vmrun clone /Users/geekcoding101.com/Virtual\ Machines.localized/k8s-base-image.vmwarevm/k8s-base-image.vmx  /Users/geekcoding101.com/Virtual\ Machines.localized/k8s-4.vmwarevm/k8s-4.vmx full
sed -i &apos;&apos; &apos;s/displayName = &quot;Clone of k8s-base-image&quot;/displayName = &quot;k8s-4&quot;/&apos; &quot;/Users/geekcoding101.com/Virtual Machines.localized/k8s-4.vmwarevm/k8s-4.vmx&quot;

vmrun clone /Users/geekcoding101.com/Virtual\ Machines.localized/k8s-base-image.vmwarevm/k8s-base-image.vmx  /Users/geekcoding101.com/Virtual\ Machines.localized/k8s-5.vmwarevm/k8s-5.vmx full
sed -i &apos;&apos; &apos;s/displayName = &quot;Clone of k8s-base-image&quot;/displayName = &quot;k8s-5&quot;/&apos; &quot;/Users/geekcoding101.com/Virtual Machines.localized/k8s-5.vmwarevm/k8s-5.vmx&quot;

&lt;/code&gt;&lt;/pre&gt;
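&lt;p&gt;The per-node clone commands follow a fixed pattern, so they can be generated in a loop. This sketch only prints the &lt;code&gt;vmrun&lt;/code&gt; commands for review (they run on the host Mac; the matching &lt;code&gt;sed&lt;/code&gt; display-name fixes would follow the same pattern):&lt;/p&gt;

```shell
# Host-side path to the VM bundle directory (from the commands above).
base='/Users/geekcoding101.com/Virtual Machines.localized'

# Print one vmrun clone command per worker node, without executing anything.
gen_clone_cmds() {
  for n in 2 3 4 5; do
    printf 'vmrun clone "%s/k8s-base-image.vmwarevm/k8s-base-image.vmx" "%s/k8s-%s.vmwarevm/k8s-%s.vmx" full\n' "$base" "$base" "$n" "$n"
  done
}
gen_clone_cmds
```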
&lt;p&gt;Rescan in VMware Fusion and you will see:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./vmfusino-vm-list-with-kubernetes-all-nodes.webp&quot; alt=&quot;vmfusino vm list with kubernetes all nodes&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Remember to customize each node one by one with a different &lt;code&gt;input.json&lt;/code&gt; (a kind reminder: before the network is configured, &lt;code&gt;k8s-2&lt;/code&gt; to &lt;code&gt;k8s-5&lt;/code&gt; will use the same IP &lt;code&gt;172.16.211.11&lt;/code&gt; as &lt;code&gt;k8s-1&lt;/code&gt;, so please shut down &lt;code&gt;k8s-1&lt;/code&gt; before finishing the configuration on &lt;code&gt;k8s-2&lt;/code&gt; to &lt;code&gt;k8s-5&lt;/code&gt;, and &lt;code&gt;localserver&lt;/code&gt; should be running as well). I put all the &lt;code&gt;input.json&lt;/code&gt; files here so you can copy them (My bad! I should have prepared those files in the &lt;a href=&quot;/posts/kubernetes-tutorial-part1#Creating_the_Rocky_9_Base_VM&quot;&gt;baseimage&lt;/a&gt;!):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ ls /opt/share_tools/init_data
devbox_vm_input.json  k8s-1_vm_input.json  k8s-2_vm_input.json  k8s-3_vm_input.json  k8s-4_vm_input.json  k8s-5_vm_input.json  localserver_vm_input.json
❯ cat /opt/share_tools/init_data/k8s-2_vm_input.json
{
  &quot;hostname&quot;: &quot;k8s-2&quot;,
  &quot;ip&quot;: &quot;172.16.211.12&quot;,
  &quot;subnet&quot;: &quot;24&quot;,
  &quot;gateway&quot;: &quot;172.16.211.2&quot;,
  &quot;dns1&quot;: &quot;172.16.211.100&quot;,
  &quot;dns2&quot;: &quot;8.8.8.8&quot;,
  &quot;domain&quot;: &quot;dev.geekcoding101local.com&quot;,
  &quot;ansible_key_path&quot;: &quot;~/.ssh/ansible_ed25519&quot;,
  &quot;ssh_key_path&quot;: &quot;~/.ssh/ssh_ed25519&quot;
}

❯ cat /opt/share_tools/init_data/k8s-3_vm_input.json
{
  &quot;hostname&quot;: &quot;k8s-3&quot;,
  &quot;ip&quot;: &quot;172.16.211.13&quot;,
  &quot;subnet&quot;: &quot;24&quot;,
  &quot;gateway&quot;: &quot;172.16.211.2&quot;,
  &quot;dns1&quot;: &quot;172.16.211.100&quot;,
  &quot;dns2&quot;: &quot;8.8.8.8&quot;,
  &quot;domain&quot;: &quot;dev.geekcoding101local.com&quot;,
  &quot;ansible_key_path&quot;: &quot;~/.ssh/ansible_ed25519&quot;,
  &quot;ssh_key_path&quot;: &quot;~/.ssh/ssh_ed25519&quot;
}

❯ cat /opt/share_tools/init_data/k8s-4_vm_input.json
{
  &quot;hostname&quot;: &quot;k8s-4&quot;,
  &quot;ip&quot;: &quot;172.16.211.14&quot;,
  &quot;subnet&quot;: &quot;24&quot;,
  &quot;gateway&quot;: &quot;172.16.211.2&quot;,
  &quot;dns1&quot;: &quot;172.16.211.100&quot;,
  &quot;dns2&quot;: &quot;8.8.8.8&quot;,
  &quot;domain&quot;: &quot;dev.geekcoding101local.com&quot;,
  &quot;ansible_key_path&quot;: &quot;~/.ssh/ansible_ed25519&quot;,
  &quot;ssh_key_path&quot;: &quot;~/.ssh/ssh_ed25519&quot;
}

❯ cat /opt/share_tools/init_data/k8s-5_vm_input.json
{
  &quot;hostname&quot;: &quot;k8s-5&quot;,
  &quot;ip&quot;: &quot;172.16.211.15&quot;,
  &quot;subnet&quot;: &quot;24&quot;,
  &quot;gateway&quot;: &quot;172.16.211.2&quot;,
  &quot;dns1&quot;: &quot;172.16.211.100&quot;,
  &quot;dns2&quot;: &quot;8.8.8.8&quot;,
  &quot;domain&quot;: &quot;dev.geekcoding101local.com&quot;,
  &quot;ansible_key_path&quot;: &quot;~/.ssh/ansible_ed25519&quot;,
  &quot;ssh_key_path&quot;: &quot;~/.ssh/ssh_ed25519&quot;
}

&lt;/code&gt;&lt;/pre&gt;
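&lt;p&gt;Since the four files differ only in hostname and IP, they can be generated instead of written by hand. This is a sketch assuming node N gets IP &lt;code&gt;172.16.211.1N&lt;/code&gt; and that dns1 is the local DNS server from Part 2; it writes into a temporary directory, so adjust the output path for real use:&lt;/p&gt;

```shell
# Generate the k8s-2..k8s-5 input.json files; node N gets IP 172.16.211.1N.
# Written to a temp dir here; point outdir at /opt/share_tools/init_data for real use.
outdir=$(mktemp -d)
for n in 2 3 4 5; do
  printf '{
  "hostname": "k8s-%s",
  "ip": "172.16.211.1%s",
  "subnet": "24",
  "gateway": "172.16.211.2",
  "dns1": "172.16.211.100",
  "dns2": "8.8.8.8",
  "domain": "dev.geekcoding101local.com",
  "ansible_key_path": "~/.ssh/ansible_ed25519",
  "ssh_key_path": "~/.ssh/ssh_ed25519"
}\n' "$n" "$n" > "${outdir}/k8s-${n}_vm_input.json"
done
ls "$outdir"
```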
&lt;p&gt;Then run nslookup and ping to make sure the network has no problems in this Kubernetes cluster setup.&lt;/p&gt;
&lt;p&gt;For example, when you start &lt;code&gt;k8s-2&lt;/code&gt; for the first time, you will see it still using &lt;code&gt;k8s-1&lt;/code&gt;&apos;s hostname and IP:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./first-time-start-k8s-2.webp&quot; alt=&quot;first time start k8s 2&quot; /&gt;&lt;/p&gt;
&lt;p&gt;After running the ansible script, log out and log in again as root:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./k8s-2-ready.webp&quot; alt=&quot;k8s-2 is ready&quot; /&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h1&gt;Setup Kubernetes Master Node k8s-1&lt;/h1&gt;
&lt;p&gt;I know it&apos;s kind of unbelievable that we have prepared so much for this Kubernetes cluster setup, yet the actual steps to form the master and join the worker nodes are just two or three commands...&lt;/p&gt;
&lt;h2&gt;Run sudo kubeadm init on Master Node&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;:::info
&lt;code&gt;10.244.0.0/16&lt;/code&gt; is required by &lt;a href=&quot;https://github.com/flannel-io/flannel&quot;&gt;Flannel&lt;/a&gt; which is the &lt;a href=&quot;https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/&quot;&gt;CNI plugin&lt;/a&gt; I am going to use in this Kubernetes cluster setup.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;--service-cidr&lt;/code&gt; flag in &lt;code&gt;kubeadm init&lt;/code&gt; &lt;strong&gt;defines the virtual IP range&lt;/strong&gt; for &lt;strong&gt;Kubernetes services&lt;/strong&gt; (ClusterIP services). This CIDR block is &lt;strong&gt;used by kube-proxy and the cluster DNS for internal service discovery&lt;/strong&gt;. Typically, you can specify &lt;strong&gt;any private IP range&lt;/strong&gt; that &lt;strong&gt;does&lt;/strong&gt; &lt;strong&gt;not overlap&lt;/strong&gt; with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;--pod-network-cidr&lt;/code&gt; (e.g., &lt;code&gt;10.244.0.0/16&lt;/code&gt; for Flannel)&lt;/li&gt;
&lt;li&gt;Any &lt;strong&gt;physical&lt;/strong&gt; or &lt;strong&gt;existing&lt;/strong&gt; network in your infrastructure.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;:::&lt;/p&gt;
&lt;p&gt;:::warning
You might hit the following errors in Kubernetes cluster setup:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;validate CRI v1 image API for endpoint &quot;unix:///var/run/containerd/containerd.sock&quot;: rpc error: code = Unimplemented desc = unknown service runtime.v1.ImageService

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Or:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[ERROR CRI]: container runtime is not running: output: time=&quot;2025-03-08T11:11:00-08:00&quot; level=fatal msg=&quot;validate service connection: validate CRI v1 runtime API for endpoint \&quot;unix:///var/run/containerd/containerd.sock\&quot;: rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService&quot;
, error: exit status 1

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;The solution:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Source: &lt;a href=&quot;https://forum.linuxfoundation.org/discussion/862825/kubeadm-init-error-cri-v1-runtime-api-is-not-implemented&quot;&gt;Linux Foundation Forum&lt;/a&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cd ~
mkdir bak
sudo cp /etc/containerd/config.toml ./bak
sudo rm -fr /etc/containerd/config.toml
sudo systemctl restart containerd

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After this, it should start working!
:::&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12
[sudo] password for admin:
W0308 22:24:40.483800  123896 version.go:104] could not fetch a Kubernetes version from the internet: unable to get URL &quot;https://dl.k8s.io/release/stable-1.txt&quot;: Get &quot;https://cdn.dl.k8s.io/release/stable-1.txt&quot;: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
W0308 22:24:40.483994  123896 version.go:105] falling back to the local client version: v1.29.14
[init] Using Kubernetes version: v1.29.14
[preflight] Running pre-flight checks
        [WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using &apos;kubeadm config images pull&apos;
W0308 22:24:40.834473  123896 checks.go:835] detected that the sandbox image &quot;registry.k8s.io/pause:3.8&quot; of the container runtime is inconsistent with that used by kubeadm. It is recommended that using &quot;registry.k8s.io/pause:3.9&quot; as the CRI sandbox image.
[certs] Using certificateDir folder &quot;/etc/kubernetes/pki&quot;
[certs] Generating &quot;ca&quot; certificate and key
[certs] Generating &quot;apiserver&quot; certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.16.211.11]
[certs] Generating &quot;apiserver-kubelet-client&quot; certificate and key
[certs] Generating &quot;front-proxy-ca&quot; certificate and key
[certs] Generating &quot;front-proxy-client&quot; certificate and key
[certs] Generating &quot;etcd/ca&quot; certificate and key
[certs] Generating &quot;etcd/server&quot; certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-1 localhost] and IPs [172.16.211.11 127.0.0.1 ::1]
[certs] Generating &quot;etcd/peer&quot; certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-1 localhost] and IPs [172.16.211.11 127.0.0.1 ::1]
[certs] Generating &quot;etcd/healthcheck-client&quot; certificate and key
[certs] Generating &quot;apiserver-etcd-client&quot; certificate and key
[certs] Generating &quot;sa&quot; key and public key
[kubeconfig] Using kubeconfig folder &quot;/etc/kubernetes&quot;
[kubeconfig] Writing &quot;admin.conf&quot; kubeconfig file
[kubeconfig] Writing &quot;super-admin.conf&quot; kubeconfig file
[kubeconfig] Writing &quot;kubelet.conf&quot; kubeconfig file
[kubeconfig] Writing &quot;controller-manager.conf&quot; kubeconfig file
[kubeconfig] Writing &quot;scheduler.conf&quot; kubeconfig file
[etcd] Creating static Pod manifest for local etcd in &quot;/etc/kubernetes/manifests&quot;
[control-plane] Using manifest folder &quot;/etc/kubernetes/manifests&quot;
[control-plane] Creating static Pod manifest for &quot;kube-apiserver&quot;
[control-plane] Creating static Pod manifest for &quot;kube-controller-manager&quot;
[control-plane] Creating static Pod manifest for &quot;kube-scheduler&quot;
[kubelet-start] Writing kubelet environment file with flags to file &quot;/var/lib/kubelet/kubeadm-flags.env&quot;
[kubelet-start] Writing kubelet configuration to file &quot;/var/lib/kubelet/config.yaml&quot;
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory &quot;/etc/kubernetes/manifests&quot;. This can take up to 4m0s
[apiclient] All control plane components are healthy after 34.003550 seconds
[upload-config] Storing the configuration used in ConfigMap &quot;kubeadm-config&quot; in the &quot;kube-system&quot; Namespace
[kubelet] Creating a ConfigMap &quot;kubelet-config&quot; in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node k8s-1 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node k8s-1 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: yjfem7.na3i596dag4eogh9
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the &quot;cluster-info&quot; ConfigMap in the &quot;kube-public&quot; namespace
[kubelet-finalize] Updating &quot;/etc/kubernetes/kubelet.conf&quot; to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run &quot;kubectl apply -f [podnetwork].yaml&quot; with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.16.211.11:6443 --token yjfem7.na3i596dag4eogh9 \
        --discovery-token-ca-cert-hash sha256:23622f60b6274309294e1693439cd9a5e897c4037baaa62d5980a64745445cac

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since we&apos;re not root, perform the steps mentioned above in your Kubernetes cluster setup:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
&lt;/code&gt;&lt;/pre&gt;
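&lt;p&gt;If you rebuild clusters often, the three commands above can be wrapped in a small helper. This is only a sketch (the &lt;code&gt;setup_kubeconfig&lt;/code&gt; name and the refuse-to-overwrite behavior are my own choices, not something kubeadm provides; reading &lt;code&gt;/etc/kubernetes/admin.conf&lt;/code&gt; itself still needs root, which is why the original commands use &lt;code&gt;sudo cp&lt;/code&gt; plus &lt;code&gt;chown&lt;/code&gt;):&lt;/p&gt;

```shell
#!/bin/sh
# Sketch: install a kubeconfig file for the current user.
# setup_kubeconfig is a hypothetical helper, not part of kubeadm.
setup_kubeconfig() {
  src="$1"                      # e.g. a readable copy of /etc/kubernetes/admin.conf
  dest="${HOME}/.kube/config"
  mkdir -p "${HOME}/.kube"
  if [ -e "${dest}" ]; then     # mimic cp -i: never clobber silently
    echo "refusing to overwrite ${dest}"
    return 1
  fi
  cp "${src}" "${dest}"
  chmod 600 "${dest}"           # the kubeconfig holds cluster credentials
}
```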
&lt;hr /&gt;
&lt;h2&gt;Verify Kubernetes Cluster Status&lt;/h2&gt;
&lt;p&gt;Check the cluster nodes:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl get nodes -o wide
kubectl get pods -n kube-system

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You might need to wait a few minutes for all services to be running. Here you go:&lt;img src=&quot;./check-kubernetes-status-after-init.webp&quot; alt=&quot;check kubernetes status after init&quot; /&gt;&lt;/p&gt;
&lt;p&gt;As you see, we have pods:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;kube-apiserver&lt;/li&gt;
&lt;li&gt;kube-controller-manager&lt;/li&gt;
&lt;li&gt;kube-scheduler&lt;/li&gt;
&lt;li&gt;etcd&lt;/li&gt;
&lt;li&gt;kube-proxy&lt;/li&gt;
&lt;li&gt;CoreDNS&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Test basic Kubernetes commands:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl cluster-info
kubectl get namespaces

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./cluster-info-1.webp&quot; alt=&quot;cluster info output&quot; /&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Deploy Flannel for Pod Networking&lt;/h2&gt;
&lt;p&gt;Are you excited? We&apos;re almost there to get our Kubernetes cluster setup ready!&lt;/p&gt;
&lt;p&gt;Okay, &lt;a href=&quot;https://github.com/flannel-io/flannel&quot;&gt;Flannel&lt;/a&gt; must be installed for pod-to-pod communication:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl --insecure-skip-tls-verify apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There are alternative CNIs we could choose for a Kubernetes cluster setup. I chose Flannel because it is simple and well suited for lightweight networking in small to medium Kubernetes clusters. It supports VXLAN, host-gw, and other simple encapsulation methods.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://docs.tigera.io/calico/latest/about/&quot;&gt;Calico&lt;/a&gt; has better performance and security policies support in cloud-native environments, but it&apos;s more complex to set up than Flannel.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/weaveworks/weave&quot;&gt;Weave Net&lt;/a&gt; occupies a similar position to Flannel. It also supports built-in network encryption, which Flannel doesn&apos;t offer.&lt;/p&gt;
&lt;p&gt;But anyway, let&apos;s focus on Flannel for now; we can explore other options later in this Kubernetes cluster setup blog series.&lt;/p&gt;
&lt;hr /&gt;
&lt;h1&gt;Set Up Worker Nodes&lt;/h1&gt;
&lt;p&gt;On each worker node (&lt;code&gt;k8s-2&lt;/code&gt;, &lt;code&gt;k8s-3&lt;/code&gt;, &lt;code&gt;k8s-4&lt;/code&gt; and &lt;code&gt;k8s-5&lt;/code&gt;), run the following as a non-root account:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo kubeadm join 172.16.211.11:6443 --token yjfem7.na3i596dag4eogh9 \
        --discovery-token-ca-cert-hash sha256:23622f60b6274309294e1693439cd9a5e897c4037baaa62d5980a64745445cac
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you need to regenerate the above join command for our Kubernetes cluster setup, run this on the master node as the admin account:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo kubeadm token create --print-join-command

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For example, the screenshot from &lt;code&gt;k8s-2&lt;/code&gt; in my Kubernetes cluster setup:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./k8s-2-joined-cluster.webp&quot; alt=&quot;k8s-2 joined cluster&quot; /&gt;&lt;/p&gt;
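&lt;p&gt;By the way, the &lt;code&gt;sha256:...&lt;/code&gt; value in the join command is simply the SHA-256 digest of the cluster CA&apos;s public key in DER encoding, so you can also recompute it straight from &lt;code&gt;/etc/kubernetes/pki/ca.crt&lt;/code&gt; with standard &lt;code&gt;openssl&lt;/code&gt; subcommands. A sketch (&lt;code&gt;ca_cert_hash&lt;/code&gt; is a made-up helper name):&lt;/p&gt;

```shell
# Recompute kubeadm's --discovery-token-ca-cert-hash from the CA cert:
# it is the SHA-256 of the CA public key in DER encoding.
ca_cert_hash() {
  cert="$1"                     # e.g. /etc/kubernetes/pki/ca.crt
  hash=$(openssl x509 -pubkey -noout -in "$cert" \
    | openssl pkey -pubin -outform der \
    | openssl dgst -sha256 -hex \
    | awk '{ print $NF }')
  echo "sha256:${hash}"
}

# Usage on the master node:
#   ca_cert_hash /etc/kubernetes/pki/ca.crt
```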
&lt;p&gt;Hold tight! This is our last step to finish the Kubernetes cluster setup!&lt;/p&gt;
&lt;hr /&gt;
&lt;h1&gt;Final Steps&lt;/h1&gt;
&lt;p&gt;Now, let&apos;s verify our Kubernetes cluster setup:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl get nodes -o wide

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If everything is set up correctly, all nodes should be in a &lt;code&gt;Ready&lt;/code&gt; state. 🎉&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./kubernetes-cluster-formed-fully.webp&quot; alt=&quot;kubernetes cluster setup formed fully&quot; /&gt;&lt;/p&gt;
&lt;p&gt;We did it!!! That’s it for this post!&lt;/p&gt;
&lt;p&gt;Remember the errors we observed in &lt;code&gt;systemctl status kubelet&lt;/code&gt; at the beginning of this post? Check it again:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./kubelet-service-status.webp&quot; alt=&quot;kubelet service status is good now&quot; /&gt;&lt;/p&gt;
&lt;p&gt;In the next post, I will explore the &lt;code&gt;NodePort&lt;/code&gt; and &lt;code&gt;ClusterIP&lt;/code&gt; Kubernetes services with an Nginx pod in this Kubernetes cluster setup and deep dive into Flannel troubleshooting!&lt;/p&gt;
&lt;p&gt;Stay tuned! 🚀&lt;/p&gt;
&lt;p&gt;:::info
You&apos;re on a roll! Don&apos;t stop now—check out the full series and level up your Kubernetes skills. Each post builds on the last, so make sure you haven’t missed anything! 👇&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/kubernetes-tutorial-part1&quot;&gt;Part 1&lt;/a&gt;&lt;/strong&gt;, I laid out the &lt;strong&gt;networking plan&lt;/strong&gt;, my &lt;strong&gt;goals for setting up Kubernetes&lt;/strong&gt;, and how to &lt;strong&gt;prepare a base VM image&lt;/strong&gt; for the cluster.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/tutorial-part2-dns-server-ntp&quot;&gt;Part 2&lt;/a&gt;&lt;/strong&gt;, I walked through &lt;strong&gt;configuring a local DNS server and NTP server&lt;/strong&gt;, essential for stable name resolution and time synchronization across nodes locally. These foundational steps will make our Kubernetes setup smoother.&lt;/p&gt;
&lt;p&gt;🚀 &lt;strong&gt;&lt;a href=&quot;/posts/part3-kubernetes-cluster-setup&quot;&gt;Part 3&lt;/a&gt;&lt;/strong&gt; is the post you are reading now.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/part-4-nodeport-vs-clusterip&quot;&gt;Part 4&lt;/a&gt;&lt;/strong&gt;, I explored &lt;code&gt;NodePort&lt;/code&gt; and &lt;code&gt;ClusterIP&lt;/code&gt;, and explained the key differences, use cases, and when to choose each for internal and external service access! 🔥&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/externalname-loadbalancer-5&quot;&gt;Part 5&lt;/a&gt;&lt;/strong&gt;, I explored how to use &lt;code&gt;ExternalName&lt;/code&gt; and &lt;code&gt;LoadBalancer&lt;/code&gt; services and how to run load testing with the &lt;code&gt;hey&lt;/code&gt; tool.
:::&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>NodePort vs ClusterIP - Ultimate Kubernetes Tutorial Part 4</title><link>https://geekcoding101.com/posts/part-4-nodeport-vs-clusterip</link><guid isPermaLink="true">https://geekcoding101.com/posts/part-4-nodeport-vs-clusterip</guid><pubDate>Sat, 15 Mar 2025 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Introduction&lt;/h1&gt;
&lt;p&gt;Hey, welcome back to my &lt;a href=&quot;https://geekcoding101.com/tags/kubernetes&quot;&gt;ultimate Kubernetes tutorials&lt;/a&gt;! Now that our 1 master + 4 worker node cluster is up and running, it’s time to dive into &lt;code&gt;NodePort&lt;/code&gt; vs. &lt;code&gt;ClusterIP&lt;/code&gt;—two key service types in Kubernetes. Services act as the traffic controllers of your cluster, making sure pods can communicate reliably. Without them, your pods would be like isolated islands, unable to connect in a structured way. Pods are ephemeral, constantly changing IPs. That’s where &lt;a href=&quot;https://kubernetes.io/docs/concepts/services-networking/service/&quot;&gt;Kubernetes services&lt;/a&gt; step in—ensuring stable access, whether for internal pod-to-pod networking or external exposure. Let’s break down how they work and when to use each! 🚀&lt;/p&gt;
&lt;p&gt;Before we start, here&apos;s a quick summary of the four common Kubernetes service types:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service Type&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ClusterIP&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Exposes the service internally within the cluster. No external access.&lt;/td&gt;
&lt;td&gt;Internal microservices that only communicate within Kubernetes.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;NodePort&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Exposes the service on a static port on each node&apos;s IP, making it accessible externally.&lt;/td&gt;
&lt;td&gt;Basic external access without a LoadBalancer.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;LoadBalancer&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Creates an external load balancer that directs traffic to the service.&lt;/td&gt;
&lt;td&gt;Production environments requiring automated load balancing.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ExternalName&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Maps a Kubernetes Service to an external DNS name instead of forwarding traffic.&lt;/td&gt;
&lt;td&gt;Redirecting traffic to external services outside the cluster.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;P.S.&lt;/strong&gt; Headless Service is also a Kubernetes Service type, but it behaves differently from the usual four.&lt;/p&gt;
&lt;p&gt;In this post, I will guide you to:&lt;/p&gt;
&lt;p&gt;✅ Create an &lt;strong&gt;&lt;a href=&quot;https://hub.docker.com/_/nginx&quot;&gt;Nginx&lt;/a&gt; deployment&lt;/strong&gt; running on a &lt;strong&gt;single node&lt;/strong&gt;&lt;br /&gt;
✅ Expose it using a &lt;strong&gt;NodePort Service&lt;/strong&gt;&lt;br /&gt;
✅ Verify accessibility inside and outside the cluster&lt;br /&gt;
✅ Expose it using a &lt;strong&gt;ClusterIP Service&lt;/strong&gt;&lt;br /&gt;
✅ Verify accessibility inside and outside the cluster&lt;br /&gt;
✅ Run a comparison between &lt;strong&gt;ClusterIP Service&lt;/strong&gt; and &lt;strong&gt;NodePort Service&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Let’s get started! 🚀&lt;/p&gt;
&lt;h1&gt;Deploying Nginx on a Single Node&lt;/h1&gt;
&lt;p&gt;Let&apos;s create a simple Kubernetes deployment with Nginx running on a single node.&lt;/p&gt;
&lt;p&gt;:::info
Unless stated otherwise, all commands are performed on &lt;code&gt;k8s-1&lt;/code&gt; as the &lt;code&gt;admin&lt;/code&gt; account.
:::&lt;/p&gt;
&lt;h2&gt;Create a Testing Namespace&lt;/h2&gt;
&lt;p&gt;Namespaces in Kubernetes are like &lt;strong&gt;virtual clusters within your cluster&lt;/strong&gt;, helping you organize and isolate resources. By creating a &lt;strong&gt;testing&lt;/strong&gt; namespace, we keep our deployment separate from the &lt;strong&gt;default namespace&lt;/strong&gt;, preventing conflicts with existing workloads and making cleanup easier. This way, when we&apos;re done experimenting, we can simply delete the namespace, wiping out everything inside it—no need to manually remove individual resources.&lt;/p&gt;
&lt;p&gt;:::info
If you don&apos;t create one, the new deployment will use the default namespace.
:::&lt;/p&gt;
&lt;p&gt;List our existing namespaces:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl get namespaces -o wide
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./default-namespaces-list.webp&quot; alt=&quot;default namespaces list&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Check the current default namespace:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl config get-contexts

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If the &lt;code&gt;NAMESPACE&lt;/code&gt; column is empty, it means the namespace is set to &lt;strong&gt;default&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./current-namespace-check.webp&quot; alt=&quot;check current namespace&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Let&apos;s create our namespace &lt;code&gt;service-type-test&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl create namespace service-type-test

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./added-new-namespaces.webp&quot; alt=&quot;added new namespaces &amp;quot;service-type-test&amp;quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Set our new namespace &lt;code&gt;service-type-test&lt;/code&gt; as the default:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl config set-context --current --namespace=service-type-test

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, all &lt;code&gt;kubectl&lt;/code&gt; commands (&lt;strong&gt;under the current account session&lt;/strong&gt;) will default to this namespace unless another is explicitly specified. If you log in as &lt;code&gt;root&lt;/code&gt;, you need to perform the same step again to get the same convenience.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./set-new-namespace-as-default.webp&quot; alt=&quot;set new namespace as default&quot; /&gt;&lt;/p&gt;
&lt;p&gt;You can also verify the change with the following command:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl config view --minify --output &apos;jsonpath={..namespace}&apos;

&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Create a Deployment YAML&lt;/h2&gt;
&lt;p&gt;Now let&apos;s create a deployment file:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cd
mkdir nginx-deployment
vim nginx-deployment/nginx-single-node.yaml

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Paste the following YAML:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-single
  namespace: service-type-test
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      nodeSelector:
        kubernetes.io/hostname: k8s-2
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80

&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Apply the Deployment&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;kubectl apply -f nginx-deployment/nginx-single-node.yaml

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Check if the pod is running:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl get pods -o wide

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Example Output:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ kubectl get pods -o wide
NAME                           READY   STATUS    RESTARTS   AGE    IP           NODE    NOMINATED NODE   READINESS GATES
nginx-single-7dfff5577-2v25s   1/1     Running   0          117s   10.244.1.2   k8s-2   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;

&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Testing Nginx from Inside the Pod&lt;/h1&gt;
&lt;p&gt;At this moment, there is no external access. You must log into the pod or create a temporary test pod.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl exec -it nginx-single-7dfff5577-2v25s -- sh

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Inside the pod:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# curl http://10.244.1.2

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Example Output:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./check-nginx-in-pod.webp&quot; alt=&quot;check nginx in pod&quot; /&gt;&lt;/p&gt;
&lt;p&gt;So this test method has obvious cons: it is only valid inside the Nginx pod (not cluster-wide), and it doesn&apos;t verify network policies, DNS, or service discovery for external access.&lt;/p&gt;
&lt;h1&gt;Testing Nginx By Creating a Temporary Pod&lt;/h1&gt;
&lt;p&gt;With this method, we can test networking from a different pod (simulating real application behavior). It ensures DNS resolution and service discovery work correctly, and it&apos;s stateless and temporary (deleted after exit).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl run testpod --rm -it --image=rockylinux:9 -- bash

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once it&apos;s running, we need to install several tools, e.g. &lt;code&gt;ping&lt;/code&gt; and &lt;code&gt;nslookup&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;dnf install -y iputils net-tools nc traceroute bind-utils iproute
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Troubleshooting Pod&apos;s Internet Access Issue&lt;/h2&gt;
&lt;p&gt;You might hit an internet access issue when running the above &lt;code&gt;dnf install&lt;/code&gt; command in the testpod:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./internet-access-issue-in-pod.webp&quot; alt=&quot;internet access issue in pod&quot; /&gt;&lt;/p&gt;
&lt;p&gt;The problem is in the cluster&apos;s DNS settings. Edit the CoreDNS ConfigMap:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl edit configmap -n kube-system coredns
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let&apos;s add &lt;code&gt;8.8.8.8&lt;/code&gt; and &lt;code&gt;1.1.1.1&lt;/code&gt; as upstream resolvers.&lt;/p&gt;
&lt;p&gt;:::info
&lt;code&gt;1.1.1.1&lt;/code&gt; is Cloudflare’s public DNS resolver. &lt;code&gt;8.8.8.8&lt;/code&gt; is the public IP address of Google&apos;s primary DNS server.
:::&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./add-dns.webp&quot; alt=&quot;Add google and cloudflare DNS into Kubernetes&quot; /&gt;&lt;/p&gt;
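&lt;p&gt;For reference, after the edit the &lt;code&gt;forward&lt;/code&gt; plugin line in the Corefile should look roughly like this (a sketch; the other plugins in your ConfigMap stay unchanged, and by default the line reads &lt;code&gt;forward . /etc/resolv.conf&lt;/code&gt;):&lt;/p&gt;

```
.:53 {
    # ...other plugins (errors, health, kubernetes, cache, ...) unchanged...
    forward . 8.8.8.8 1.1.1.1
}
```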
&lt;p&gt;Then delete the existing CoreDNS pods; they will be re-created with the latest settings:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl delete pod -n kube-system -l k8s-app=kube-dns
kubectl get pods -n kube-system -l k8s-app=kube-dns
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./delete-dns-pods-and-check-recreation.webp&quot; alt=&quot;delete CoreDNS pods and check recreation&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Then we can immediately check the &lt;code&gt;dnf&lt;/code&gt; command in &lt;code&gt;testpod&lt;/code&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./dnf-started-working.webp&quot; alt=&quot;dnf command started working&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;Troubleshooting Pod&apos;s Communication Issue&lt;/h2&gt;
&lt;p&gt;Check the nodes running pods:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ kubectl get pods -n service-type-test -o wide

NAME                           READY   STATUS    RESTARTS      AGE     IP           NODE    NOMINATED NODE   READINESS GATES
nginx-single-7dfff5577-2v25s   1/1     Running   2 (46h ago)   3d23h   10.244.1.4   k8s-2   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
testpod                        1/1     Running   0             46h     10.244.2.5   k8s-3   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now that we have the &lt;code&gt;ping&lt;/code&gt; command in &lt;code&gt;testpod&lt;/code&gt;, test a ping from &lt;code&gt;testpod&lt;/code&gt; to the &lt;code&gt;nginx-single-7dfff5577-2v25s&lt;/code&gt; pod. You might see:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[root@testpod /]# ping 10.244.1.4
PING 10.244.1.4 (10.244.1.4) 56(84) bytes of data.
From 10.244.1.0 icmp_seq=1 Packet filtered
From 10.244.1.0 icmp_seq=2 Packet filtered
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Typically, seeing &lt;code&gt;Packet filtered&lt;/code&gt; is caused by &lt;code&gt;firewalld rules&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Let&apos;s first understand the network on the &lt;code&gt;k8s-2&lt;/code&gt; worker node, which is running the &lt;code&gt;nginx-single-7dfff5577-2v25s&lt;/code&gt; pod. When troubleshooting Kubernetes networking, one of the first things I always check is the network interfaces on the worker node. Why? Because understanding the network layout is crucial: it tells me how traffic flows within the node and between nodes.&lt;/p&gt;
&lt;p&gt;Running &lt;code&gt;ip a&lt;/code&gt; gives a snapshot of all active network interfaces, and here’s what I see on my worker node:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ ip a
1: lo: &amp;lt;LOOPBACK,UP,LOWER_UP&amp;gt; mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens160: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt; mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:0c:29:56:d2:2d brd ff:ff:ff:ff:ff:ff
    altname enp3s0
    inet 172.16.211.12/24 brd 172.16.211.255 scope global noprefixroute ens160
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe56:d22d/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: ens224: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt; mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:0c:29:56:d2:37 brd ff:ff:ff:ff:ff:ff
    altname enp19s0
    inet 172.16.68.135/24 brd 172.16.68.255 scope global dynamic noprefixroute ens224
       valid_lft 1419sec preferred_lft 1419sec
    inet6 fe80::2a79:5bce:ed76:fa4c/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
4: docker0: &amp;lt;NO-CARRIER,BROADCAST,MULTICAST,UP&amp;gt; mtu 1500 qdisc noqueue state DOWN group default
    link/ether 7e:e6:b8:4c:23:64 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
5: flannel.1: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt; mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether 06:8c:32:9e:aa:fc brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::48c:32ff:fe9e:aafc/64 scope link
       valid_lft forever preferred_lft forever
6: cni0: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt; mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 2a:f2:2f:f3:e1:ac brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.1/24 brd 10.244.1.255 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::28f2:2fff:fef3:e1ac/64 scope link
       valid_lft forever preferred_lft forever
7: vethf4df6ba9@if2: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt; mtu 1450 qdisc noqueue master cni0 state UP group default
    link/ether 36:57:42:b1:e7:6f brd ff:ff:ff:ff:ff:ff link-netns cni-224ac32a-95b2-1ef2-b716-e1230f1e1296
    inet6 fe80::3457:42ff:feb1:e76f/64 scope link
       valid_lft forever preferred_lft forever

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To save you time, let&apos;s put this in a table:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Interface&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;lo&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Loopback&quot;&gt;Loopback&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Used for internal communication within the node. Always present.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ens160&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Primary network interface&lt;/td&gt;
&lt;td&gt;Connected to &lt;code&gt;vmnet2&lt;/code&gt;, providing a private network for our VMs (&lt;code&gt;172.16.211.12/24&lt;/code&gt;).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ens224&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Secondary network interface&lt;/td&gt;
&lt;td&gt;Connected to another network (&lt;code&gt;172.16.68.135/24&lt;/code&gt;) managed by VMFusion, &lt;strong&gt;providing external internet access&lt;/strong&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;docker0&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Docker bridge (not used for Kubernetes)&lt;/td&gt;
&lt;td&gt;Created by Docker but &lt;strong&gt;not part of Kubernetes networking&lt;/strong&gt;. Leftover from local development.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;flannel.1&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href=&quot;https://mvallim.github.io/kubernetes-under-the-hood/documentation/kube-flannel.html&quot;&gt;Flannel overlay network&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Handles &lt;strong&gt;pod-to-pod communication across nodes&lt;/strong&gt; (&lt;code&gt;10.244.1.0/32&lt;/code&gt;).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cni0&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Main CNI bridge&lt;/td&gt;
&lt;td&gt;Connects pods on this node to the Flannel overlay network (&lt;code&gt;10.244.1.1/24&lt;/code&gt;).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;veth*&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Virtual Ethernet interfaces&lt;/td&gt;
&lt;td&gt;Bridges individual pods to the &lt;code&gt;cni0&lt;/code&gt; bridge. Created dynamically as pods start.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Then check &lt;code&gt;firewall-cmd --list-all&lt;/code&gt; on &lt;code&gt;k8s-2&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ sudo firewall-cmd --list-all
[sudo] password for admin:
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: ens160 ens224 flannel.1
  sources:
  services: cockpit dhcpv6-client ssh
  ports: 6443/tcp 2379-2380/tcp 10250/tcp 10251/tcp 10252/tcp 10255/tcp 5473/tcp 8472/udp 30000-32767/tcp
  protocols:
  forward: yes
  masquerade: yes
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Did you spot the problem?&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;&lt;code&gt;cni0&lt;/code&gt; is missing from the active firewalld zone&lt;/strong&gt;! This is a problem because &lt;strong&gt;&lt;code&gt;cni0&lt;/code&gt; is the main bridge interface for pod networking&lt;/strong&gt;—it connects all pods on this node to the Flannel overlay network. Without it being part of the &lt;strong&gt;public zone&lt;/strong&gt;, firewalld might be blocking traffic between pods on this worker node.&lt;/p&gt;
&lt;p&gt;We can verify it by enabling &lt;code&gt;firewalld&lt;/code&gt; logs:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo firewall-cmd --set-log-denied=all
sudo firewall-cmd --reload
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then &lt;code&gt;tail -f /var/log/messages&lt;/code&gt; on &lt;code&gt;k8s-2&lt;/code&gt;, you should see:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Mar 14 21:28:44 k8s-2 kernel: filter_FWD_public_REJECT: IN=flannel.1 OUT=cni0 MAC=06:8c:32:9e:aa:fc:aa:fc:a1:4d:b8:af:08:00 SRC=10.244.2.0 DST=10.244.1.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=21550 DF PROTO=ICMP TYPE=8 CODE=0 ID=44547 SEQ=1
Mar 14 21:28:45 k8s-2 kernel: filter_FWD_public_REJECT: IN=flannel.1 OUT=cni0 MAC=06:8c:32:9e:aa:fc:aa:fc:a1:4d:b8:af:08:00 SRC=10.244.2.0 DST=10.244.1.4 LEN=84 TOS=0x00 PREC=0x00 TTL=62 ID=22104 DF PROTO=ICMP TYPE=8 CODE=0 ID=44547 SEQ=2

&lt;/code&gt;&lt;/pre&gt;
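&lt;p&gt;When the log is noisy, it helps to condense it into unique flows. A throwaway &lt;code&gt;sed&lt;/code&gt; pipeline like this works (a sketch; &lt;code&gt;summarize_rejects&lt;/code&gt; is a made-up name, and the exact log prefix can vary by distro and kernel):&lt;/p&gt;

```shell
# Condense firewalld/kernel reject logs into unique
# in/out-interface and src/dst-address flows, with a count per flow.
summarize_rejects() {
  sed -nE 's/.*IN=([^ ]+) OUT=([^ ]+).*SRC=([^ ]+) DST=([^ ]+).*/\1-\2 \3-\4/p' \
    | sort | uniq -c
}

# Usage: grep REJECT /var/log/messages | summarize_rejects
```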
&lt;p&gt;Good! Now we can fix the problem, just add &lt;code&gt;cni0&lt;/code&gt; into the public zone!&lt;/p&gt;
&lt;p&gt;Perform this on every node as &lt;code&gt;admin&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo firewall-cmd --permanent --zone=public --add-interface=cni0
sudo firewall-cmd --reload
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then &lt;code&gt;ping&lt;/code&gt; should start working!&lt;/p&gt;
&lt;p&gt;:::warning
Worst case, you might need to enable &lt;code&gt;masquerade&lt;/code&gt; manually in &lt;code&gt;firewalld&lt;/code&gt; if it shows a &lt;code&gt;no&lt;/code&gt; value, and you might not even have &lt;code&gt;flannel.1&lt;/code&gt; in the public zone! &lt;strong&gt;MASQUERADE&lt;/strong&gt; is essential for &lt;strong&gt;proper packet routing&lt;/strong&gt; when using an overlay network like &lt;strong&gt;Flannel:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Pods are assigned &lt;strong&gt;virtual IPs (e.g., &lt;code&gt;10.244.x.x&lt;/code&gt;)&lt;/strong&gt; that exist &lt;strong&gt;only inside the cluster&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;These IPs are &lt;strong&gt;not directly routable&lt;/strong&gt; on the physical network.&lt;/li&gt;
&lt;li&gt;MASQUERADE ensures that packets from &lt;strong&gt;one node’s pod network&lt;/strong&gt; (&lt;code&gt;10.244.x.x&lt;/code&gt;) get translated correctly when sent to another node.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Now, let&apos;s check firewalld configuration:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;firewall-cmd --query-masquerade
firewall-cmd --get-active-zones

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./check-flannel-and-masquerade.webp&quot; alt=&quot;check flannel and masquerade&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Fix the above firewalld issues on every node as &lt;code&gt;admin&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sudo firewall-cmd --permanent --add-masquerade
sudo firewall-cmd --permanent --zone=public --add-interface=flannel.1 
sudo systemctl reload firewalld

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, everything on the network side should be good!
:::&lt;/p&gt;
&lt;p&gt;As a summary of this networking troubleshooting, I prepared a diagram for you:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./icmp-flowchar.png&quot; alt=&quot;ICMP flowchart in Kubernetes cluster&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;h1&gt;Exploring NodePort Service&lt;/h1&gt;
&lt;p&gt;You might ask: can &lt;code&gt;testpod&lt;/code&gt; ping &lt;code&gt;nginx-single-7dfff5577-2v25s&lt;/code&gt; by name directly?&lt;/p&gt;
&lt;p&gt;No, it won&apos;t work by default.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Kubernetes does NOT create DNS records for individual pods.&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;When you run &lt;code&gt;ping nginx-single-7dfff5577-2v25s&lt;/code&gt;, your shell tries to resolve the pod name to an IP.&lt;/li&gt;
&lt;li&gt;But there’s no built-in DNS entry for an individual pod unless a Service is created.&lt;/li&gt;
&lt;/ul&gt;
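&lt;p&gt;What Kubernetes DNS does create is one record per Service, following a fixed pattern: service name, then namespace, then &lt;code&gt;svc&lt;/code&gt; plus the cluster domain. A trivial sketch to make the pattern explicit (&lt;code&gt;svc_fqdn&lt;/code&gt; is a made-up helper; &lt;code&gt;cluster.local&lt;/code&gt; is only the default cluster domain):&lt;/p&gt;

```shell
# Build the in-cluster DNS name of a Service:
# SERVICE.NAMESPACE.svc.CLUSTER_DOMAIN
svc_fqdn() {
  svc="$1"; ns="$2"; domain="${3:-cluster.local}"
  echo "${svc}.${ns}.svc.${domain}"
}

# svc_fqdn nginx-service service-type-test
# prints: nginx-service.service-type-test.svc.cluster.local
```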
&lt;p&gt;So, let&apos;s create a &lt;code&gt;NodePort&lt;/code&gt; service for our &lt;code&gt;Nginx&lt;/code&gt; service!&lt;/p&gt;
&lt;p&gt;On the &lt;code&gt;k8s-1&lt;/code&gt; master node, create a service YAML file:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;vim nginx-deployment/nginx-nodeport-service.yaml

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Paste the following:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: service-type-test
spec:
  selector:
    app: nginx
  type: NodePort
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Apply the service:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl apply -f nginx-deployment/nginx-nodeport-service.yaml

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We should see:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./nodeport-running.webp&quot; alt=&quot;nodeport is running&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;NodePort: Nginx Access Methods&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Can Access &lt;code&gt;nginx-service&lt;/code&gt;?&lt;/th&gt;
&lt;th&gt;Method to Use&lt;/th&gt;
&lt;th&gt;Why?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;testpod (inside cluster, same namespace)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://nginx-service.service-type-test:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Kubernetes DNS resolves it to the service.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;testpod (inside cluster, using Pod IP directly)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://10.244.1.4:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Works, but not recommended (Pod IPs change).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;testpod (inside cluster, different namespace)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ No (default) ✅ Yes (if explicitly specified)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://nginx-service.service-type-test.svc.cluster.local:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cross-namespace access needs full DNS.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Worker nodes (&lt;code&gt;k8s-2&lt;/code&gt;, &lt;code&gt;k8s-3&lt;/code&gt;, etc.)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://localhost:30080&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;NodePort is open on all nodes.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Worker nodes (&lt;code&gt;k8s-2&lt;/code&gt;, &lt;code&gt;k8s-3&lt;/code&gt;, etc.) using Pod IP directly&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://10.244.1.4:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Because Flannel automatically configures routing between nodes. If you check routes on any node, you should see something like this:&lt;br /&gt;&lt;code&gt;❯ ip route | grep 10.244&lt;/code&gt;&lt;br /&gt;&lt;code&gt;10.244.0.0/24 via 10.244.0.0 dev flannel.1 onlink&lt;/code&gt;&lt;br /&gt;&lt;code&gt;10.244.1.0/24 dev cni0 proto kernel scope link src 10.244.1.1&lt;/code&gt;&lt;br /&gt;&lt;code&gt;10.244.2.0/24 via 10.244.2.0 dev flannel.1 onlink&lt;/code&gt;&lt;br /&gt;&lt;code&gt;10.244.3.0/24 via 10.244.3.0 dev flannel.1 onlink&lt;/code&gt;&lt;br /&gt;&lt;code&gt;10.244.4.0/24 via 10.244.4.0 dev flannel.1 onlink&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Master node (&lt;code&gt;k8s-1&lt;/code&gt;)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://localhost:30080&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;NodePort is open on all nodes.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Master node using Pod IP directly&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://10.244.1.4:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Same as worker nodes.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Laptop (VMFusion, same &lt;code&gt;vmnet2&lt;/code&gt; network as worker nodes)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://&amp;lt;any-worker-node-ip&amp;gt;:30080&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;NodePort is accessible externally.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Laptop using Pod IP directly (&lt;code&gt;curl http://10.244.1.4:80&lt;/code&gt;)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;Flannel can&apos;t manage routes on my laptop ^^&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;:::info
The format &lt;strong&gt;&lt;code&gt;nginx-service.service-type-test&lt;/code&gt;&lt;/strong&gt; is a &lt;strong&gt;Kubernetes internal DNS name&lt;/strong&gt; that follows this structure:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;lt;SERVICE_NAME&amp;gt;.&amp;lt;NAMESPACE&amp;gt;.svc.cluster.local
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For example, when you run:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;curl http://nginx-service.service-type-test:80
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It is equivalent to:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;curl http://nginx-service.service-type-test.svc.cluster.local:80
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Kubernetes automatically creates a DNS entry for every service, so any pod inside the cluster can resolve &lt;code&gt;nginx-service.service-type-test&lt;/code&gt; to its NodePort service, which forwards traffic to the appropriate pod.
:::&lt;/p&gt;
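&lt;p&gt;The structure above means a service FQDN can be assembled mechanically from its parts. A tiny shell sketch using the names from this tutorial (&lt;code&gt;cluster.local&lt;/code&gt; is the default cluster domain, configurable at cluster setup):&lt;/p&gt;

```shell
# Assemble <SERVICE_NAME>.<NAMESPACE>.svc.<CLUSTER_DOMAIN>
SERVICE_NAME="nginx-service"
NAMESPACE="service-type-test"
CLUSTER_DOMAIN="cluster.local"
fqdn="${SERVICE_NAME}.${NAMESPACE}.svc.${CLUSTER_DOMAIN}"
echo "$fqdn"   # nginx-service.service-type-test.svc.cluster.local
```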
&lt;h1&gt;Exploring ClusterIP Service&lt;/h1&gt;
&lt;p&gt;We can test ClusterIP as well, create &lt;code&gt;/home/admin/nginx-deployment/nginx-clusterip-service.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: service-type-test
spec:
  selector:
    app: nginx
  type: ClusterIP
  ports:
  - port: 80
    targetPort: 80

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Delete &lt;code&gt;NodePort&lt;/code&gt; and apply &lt;code&gt;ClusterIP&lt;/code&gt; service:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl delete -f /home/admin/nginx-deployment/nginx-nodeport-service.yaml
kubectl apply -f /home/admin/nginx-deployment/nginx-clusterip-service.yaml
kubectl get service -n service-type-test -o wide
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The output would look like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ kubectl get service -n service-type-test -o wide

NAME            TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE     SELECTOR
nginx-service   NodePort   10.101.98.109   &amp;lt;none&amp;gt;        80:30080/TCP   3h19m   app=nginx
❯ kubectl delete -f /home/admin/nginx-deployment/nginx-nodeport-service.yaml

service &quot;nginx-service&quot; deleted
❯ kubectl get service -n service-type-test -o wide

No resources found in service-type-test namespace.
❯ kubectl apply -f /home/admin/nginx-deployment/nginx-clusterip-service.yaml

service/nginx-service created
❯ kubectl get service -n service-type-test -o wide

NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE   SELECTOR
nginx-service   ClusterIP   10.101.195.219   &amp;lt;none&amp;gt;        80/TCP    3s    app=nginx

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The access method table:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Can Access &lt;code&gt;nginx-service&lt;/code&gt;?&lt;/th&gt;
&lt;th&gt;Method to Use&lt;/th&gt;
&lt;th&gt;Why?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;testpod (inside cluster, same namespace)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://nginx-service.service-type-test:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Resolves to ClusterIP &lt;code&gt;10.101.195.219&lt;/code&gt;, accessible within the cluster.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;testpod (inside cluster, using Pod IP directly)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://10.244.1.4:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Works as Pod IP is routable within the cluster.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;testpod (inside cluster, different namespace)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://nginx-service.service-type-test.svc.cluster.local:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;DNS resolves to ClusterIP, accessible within the cluster.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Worker nodes (&lt;code&gt;k8s-2&lt;/code&gt;, &lt;code&gt;k8s-3&lt;/code&gt;, etc.)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://10.101.195.219:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Because Flannel automatically configures routing.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Master node (&lt;code&gt;k8s-1&lt;/code&gt;)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://10.101.195.219:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;ClusterIP is accessible from within the cluster.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Laptop (VMFusion, same &lt;code&gt;vmnet2&lt;/code&gt; network as worker nodes)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;ClusterIP is internal and not exposed externally.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Laptop using Pod IP directly (&lt;code&gt;curl http://10.244.1.4:80&lt;/code&gt;)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;Pod IPs (&lt;code&gt;10.244.x.x&lt;/code&gt;) are not reachable from outside the cluster.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2&gt;What is a ClusterIP? Why Can a Worker Node Access the ClusterIP?&lt;/h2&gt;
&lt;p&gt;I know you might ask this. Here comes the breakdown:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1.&lt;/strong&gt; ClusterIP is a Virtual IP Managed by kube-proxy.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;10.101.195.219&lt;/code&gt; is not tied to any single pod or node—it’s a virtual IP managed by &lt;code&gt;kube-proxy&lt;/code&gt;. When a request is made to &lt;code&gt;10.101.195.219:80&lt;/code&gt;, &lt;code&gt;kube-proxy&lt;/code&gt; redirects it to one of the matching pods (&lt;code&gt;10.244.1.4:80&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;2.&lt;/strong&gt; Flannel Provides Pod-to-Pod Connectivity Across Nodes&lt;/p&gt;
&lt;p&gt;As I&apos;ve mentioned previously, Flannel creates an overlay network so all pods (&lt;code&gt;10.244.x.x&lt;/code&gt;) can communicate, even across different nodes. If the Nginx pod (&lt;code&gt;10.244.1.4&lt;/code&gt;) is on a different node (&lt;code&gt;k8s-2&lt;/code&gt;) from &lt;code&gt;testpod&lt;/code&gt; (which is on &lt;code&gt;k8s-3&lt;/code&gt;), Flannel encapsulates the traffic and routes it through the worker nodes&apos; &lt;code&gt;ens160&lt;/code&gt; (&lt;code&gt;172.16.211.x&lt;/code&gt;) interfaces.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3.&lt;/strong&gt; Iptables Rules Handle Traffic Routing&lt;/p&gt;
&lt;p&gt;&lt;code&gt;kube-proxy&lt;/code&gt; sets up iptables rules on each node to redirect ClusterIP traffic to the actual pod. Run &lt;code&gt;iptables-save | grep 10.101.195.219&lt;/code&gt; on any node and you should see rules like the ones below forwarding traffic to &lt;code&gt;10.244.1.4&lt;/code&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./check-clusterip-iptables.webp&quot; alt=&quot;display iptables added for clusterip&quot; /&gt;&lt;/p&gt;
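&lt;p&gt;The rules in that screenshot follow a recognizable DNAT pattern. As a sketch, here is the same filter run against sample text rather than a live cluster (the chain names &lt;code&gt;KUBE-SVC-XXXX&lt;/code&gt; and &lt;code&gt;KUBE-SEP-YYYY&lt;/code&gt; are placeholders; real chain names are hashed):&lt;/p&gt;

```shell
# Sample kube-proxy style rules (illustrative text, NOT from a real cluster)
cat > /tmp/iptables-sample.txt <<'EOF'
-A KUBE-SERVICES -d 10.101.195.219/32 -p tcp --dport 80 -j KUBE-SVC-XXXX
-A KUBE-SVC-XXXX -j KUBE-SEP-YYYY
-A KUBE-SEP-YYYY -p tcp -j DNAT --to-destination 10.244.1.4:80
EOF
# Filter for the ClusterIP and the backing pod IP, as done in the post:
grep -E '10\.101\.195\.219|10\.244\.1\.4' /tmp/iptables-sample.txt
```

&lt;p&gt;The chain hop is the key idea: &lt;code&gt;KUBE-SERVICES&lt;/code&gt; matches the ClusterIP, jumps to a per-service chain, which picks a per-endpoint chain that DNATs to the pod.&lt;/p&gt;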
&lt;h1&gt;NodePort vs ClusterIP&lt;/h1&gt;
&lt;p&gt;Before we wrap up, let&apos;s take a step back and compare &lt;strong&gt;ClusterIP&lt;/strong&gt; and &lt;strong&gt;NodePort&lt;/strong&gt;, two essential service types in Kubernetes. While both enable communication within a cluster, their accessibility and use cases differ significantly. Whether you&apos;re building internal microservices or exposing an application externally, choosing the right service type is crucial.&lt;/p&gt;
&lt;p&gt;It&apos;s always easier to compare two similar technologies with a table. The one below summarizes their key differences to help you decide which fits your needs best.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;ClusterIP&lt;/th&gt;
&lt;th&gt;NodePort&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Accessibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Only accessible within the cluster&lt;/td&gt;
&lt;td&gt;Accessible from outside the cluster via &lt;code&gt;NodeIP:NodePort&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Default Behavior&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Assigned a private IP within the cluster&lt;/td&gt;
&lt;td&gt;Exposes a service on a high-numbered port (30000-32767) on each node&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Use Case&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Internal communication between microservices&lt;/td&gt;
&lt;td&gt;External access without a LoadBalancer, typically for development/testing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;How to Access&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Via &lt;code&gt;ClusterIP&lt;/code&gt; or service name inside the cluster&lt;/td&gt;
&lt;td&gt;Via &lt;code&gt;http://&amp;lt;NodeIP&amp;gt;:&amp;lt;NodePort&amp;gt;&lt;/code&gt; from external clients&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Example Service YAML&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;apiVersion: v1&lt;/code&gt;&lt;br /&gt;&lt;code&gt;kind: Service&lt;/code&gt;&lt;br /&gt;&lt;code&gt;metadata:&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;name: my-clusterip-service&lt;/code&gt;&lt;br /&gt;&lt;code&gt;spec:&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;type: ClusterIP&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;selector:&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;app: my-app&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;ports:&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;- port: 80&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;targetPort: 80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;apiVersion: v1&lt;/code&gt;&lt;br /&gt;&lt;code&gt;kind: Service&lt;/code&gt;&lt;br /&gt;&lt;code&gt;metadata:&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;name: my-nodeport-service&lt;/code&gt;&lt;br /&gt;&lt;code&gt;spec:&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;type: NodePort&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;selector:&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;app: my-app&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;ports:&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;- port: 80&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;targetPort: 80&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;nodePort: 30080&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Requires External Networking?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No, works entirely within the cluster&lt;/td&gt;
&lt;td&gt;Yes, needs the node&apos;s IP to be reachable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Security Considerations&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;More secure since it&apos;s only accessible inside the cluster&lt;/td&gt;
&lt;td&gt;Less secure as it exposes a port on all nodes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h1&gt;🎉 Congratulations!&lt;/h1&gt;
&lt;p&gt;If you’ve made it this far, congrats! Our Nginx instance is now fully accessible, we&apos;ve learned the NodePort and ClusterIP services, and honestly, this feels like a huge win!&lt;/p&gt;
&lt;p&gt;:::info
I originally thought about wrapping up the series here—it’s been an intense ride. I spent &lt;strong&gt;a full week&lt;/strong&gt;, squeezing every spare moment and working around the clock to tackle one of the most crucial (and driest) parts of Kubernetes. I was in that state of learning excitement, pushing through, and now that it’s finally done, I feel a &lt;strong&gt;huge sense of accomplishment… and total exhaustion&lt;/strong&gt;. But seeing everything come together is just &lt;strong&gt;too satisfying&lt;/strong&gt; to stop now. So… I’m keeping this journey going! 🚀
:::&lt;/p&gt;
&lt;p&gt;Next up, I’ll be diving into other Kubernetes services—ExternalName and LoadBalancer—because why stop when there’s so much more to explore?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Stay tuned for the next post!&lt;/strong&gt; 😎🔥&lt;/p&gt;
&lt;p&gt;:::info
You&apos;re on a roll! Don&apos;t stop now—check out the full series and level up your Kubernetes skills. Each post builds on the last, so make sure you haven’t missed anything! 👇&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/kubernetes-tutorial-part1&quot;&gt;Part 1&lt;/a&gt;&lt;/strong&gt;, I laid out the &lt;strong&gt;networking plan&lt;/strong&gt;, my &lt;strong&gt;goals for setting up Kubernetes&lt;/strong&gt;, and how to &lt;strong&gt;prepare a base VM image&lt;/strong&gt; for the cluster.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/tutorial-part2-dns-server-ntp&quot;&gt;Part 2&lt;/a&gt;&lt;/strong&gt;, I walked through &lt;strong&gt;configuring a local DNS server and NTP server&lt;/strong&gt;, essential for stable name resolution and time synchronization across nodes locally. These foundational steps will make our Kubernetes setup smoother.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/part3-kubernetes-cluster-setup&quot;&gt;Part 3&lt;/a&gt;&lt;/strong&gt;, I finished the Kubernetes cluster setup with Flannel, ending up with one Kubernetes master and four worker nodes ready for real workloads 🔥&lt;/p&gt;
&lt;p&gt;🚀 &lt;strong&gt;&lt;a href=&quot;/posts/part-4-nodeport-vs-clusterip&quot;&gt;Part 4&lt;/a&gt;&lt;/strong&gt; is the current one!&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/externalname-loadbalancer-5&quot;&gt;Part 5&lt;/a&gt;&lt;/strong&gt;, I explored how to use ExternalName and LoadBalancer services and how to run load testing with the &lt;code&gt;hey&lt;/code&gt; tool.&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>ExternalName and LoadBalancer - Ultimate Kubernetes Tutorial Part 5</title><link>https://geekcoding101.com/posts/externalname-loadbalancer-5</link><guid isPermaLink="true">https://geekcoding101.com/posts/externalname-loadbalancer-5</guid><pubDate>Tue, 18 Mar 2025 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Introduction&lt;/h1&gt;
&lt;p&gt;Hey, welcome back to my &lt;a href=&quot;/tags/kubernetes&quot;&gt;ultimate Kubernetes tutorials&lt;/a&gt;! So far, we&apos;ve explored &lt;a href=&quot;/posts/part-4-nodeport-vs-clusterip&quot;&gt;&lt;strong&gt;ClusterIP&lt;/strong&gt; and &lt;strong&gt;NodePort&lt;/strong&gt;&lt;/a&gt;, but what if you need to route traffic outside your cluster or expose your app with a real external IP? That’s where &lt;a href=&quot;https://www.kubecost.com/kubernetes-best-practices/kubernetes-external-service/&quot;&gt;&lt;strong&gt;ExternalName&lt;/strong&gt;&lt;/a&gt; and &lt;a href=&quot;https://www.okteto.com/blog/kubernetes-load-balancer-service/&quot;&gt;&lt;strong&gt;LoadBalancer&lt;/strong&gt;&lt;/a&gt; services come in. &lt;strong&gt;ExternalName&lt;/strong&gt; lets your pods seamlessly connect to external services using DNS, while &lt;strong&gt;LoadBalancer&lt;/strong&gt; provides a publicly accessible endpoint for your app. In this post, we’ll break down how they work, when to use them, and how to configure them in your Kubernetes cluster. Let’s dive in! 🚀&lt;/p&gt;
&lt;h1&gt;Exploring ExternalName Service&lt;/h1&gt;
&lt;p&gt;Okay, we&apos;re still in my nginx/testpod environment in namespace &lt;code&gt;service-type-test&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;/posts/part-4-nodeport-vs-clusterip&quot;&gt;In our last post&lt;/a&gt;, we have ClusterIP running, let&apos;s delete it to get a clean environment to start:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl delete -f /home/admin/nginx-deployment/nginx-clusterip-service.yaml
kubectl get service -n service-type-test -o wide
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You should not see any services running in the output above.&lt;/p&gt;
&lt;p&gt;Now, let&apos;s work on  &lt;code&gt;ExternalName&lt;/code&gt;!&lt;/p&gt;
&lt;p&gt;Creating an &lt;code&gt;ExternalName&lt;/code&gt; service is a little simpler than creating a &lt;code&gt;NodePort&lt;/code&gt; or &lt;code&gt;ClusterIP&lt;/code&gt; service. Create a file &lt;code&gt;/home/admin/nginx-deployment/nginx-externalname-service.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: service-type-test
spec:
  type: ExternalName
  externalName: my-nginx.external.local

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Unlike &lt;strong&gt;ClusterIP&lt;/strong&gt;, &lt;strong&gt;NodePort&lt;/strong&gt;, &lt;strong&gt;LoadBalancer&lt;/strong&gt;, or &lt;strong&gt;Headless services&lt;/strong&gt;, this service does not select backend pods. Instead, it just creates a DNS alias that redirects traffic to an external hostname. So:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;No selector needed → It does not route traffic to Kubernetes pods.&lt;/li&gt;
&lt;li&gt;No labels needed → There’s no pod matching required since it’s just a DNS pointer.&lt;/li&gt;
&lt;li&gt;It simply returns the CNAME record when queried inside the cluster.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Simpler on the Kubernetes side, but more manual steps on your side ^^&lt;/p&gt;
&lt;p&gt;I must manually configure DNS resolution for &lt;code&gt;my-nginx.external.local&lt;/code&gt; so that Kubernetes can resolve it to the correct external IP or hostname.&lt;/p&gt;
&lt;p&gt;Then, how? Edit the &lt;code&gt;CoreDNS&lt;/code&gt; ConfigMap:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl edit configmap -n kube-system coredns
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then update it as below:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;hosts {
    172.16.211.12 my-nginx.external.local
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./update-coredns-config-with-hosts-for-externalname.webp&quot; alt=&quot;update coredns config with hosts for externalname&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Restart &lt;code&gt;CoreDNS&lt;/code&gt; pods and apply &lt;code&gt;ExternalName&lt;/code&gt; service:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl rollout restart deployment coredns -n kube-system 
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl apply -f ./nginx-deployment/nginx-externalname-service.yaml
kubectl get services -o wide
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You should see this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./externalname-service-running.webp&quot; alt=&quot;externalname service is running&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;Tricks on Name Resolution&lt;/h2&gt;
&lt;p&gt;Now let&apos;s try resolving the name in &lt;code&gt;testpod&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[root@testpod /]# nslookup 172.16.211.12
12.211.16.172.in-addr.arpa      name = my-nginx.external.local.

[root@testpod /]# nslookup my-nginx.external.local.
Server:         10.96.0.10
Address:        10.96.0.10#53

Name:   my-nginx.external.local
Address: 172.16.211.12

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Did you notice that I put a trailing dot &lt;code&gt;.&lt;/code&gt; when running &lt;code&gt;nslookup&lt;/code&gt; on &lt;code&gt;my-nginx.external.local&lt;/code&gt; ?&lt;/p&gt;
&lt;p&gt;It&apos;s a must with the current configuration. Otherwise, you will hit this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[root@testpod /]# nslookup my-nginx.external.local
Server:         10.96.0.10
Address:        10.96.0.10#53

** server can&apos;t find my-nginx.external.local.service-type-test.svc.cluster.local: SERVFAIL

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The reason is that the DNS query gets the Kubernetes default search domain appended, so the command above is equivalent to:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;nslookup my-nginx.external.local.service-type-test.svc.cluster.local
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This happens because:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Inside a Kubernetes pod, &lt;code&gt;/etc/resolv.conf&lt;/code&gt; is set up with &lt;code&gt;ndots:5&lt;/code&gt; and search domains such as &lt;code&gt;service-type-test.svc.cluster.local&lt;/code&gt;, so any name with fewer than five dots is first tried with those suffixes appended.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;my-nginx.external.local&lt;/code&gt; has only two dots, so the resolver treats it as a relative name and appends the search suffix, producing the failing query above.&lt;/li&gt;
&lt;/ol&gt;
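&lt;p&gt;The expansion can be sketched in plain shell (illustrative only, not real resolver code; the search list and &lt;code&gt;ndots:5&lt;/code&gt; come from the pod&apos;s &lt;code&gt;/etc/resolv.conf&lt;/code&gt;):&lt;/p&gt;

```shell
# Mimic how the stub resolver expands a relative name with ndots:5
name="my-nginx.external.local"
search="service-type-test.svc.cluster.local svc.cluster.local cluster.local"
ndots=5
dots=$(printf '%s' "$name" | awk -F. '{print NF-1}')
if [ "$dots" -lt "$ndots" ]; then
  for s in $search; do
    echo "try: $name.$s"    # suffixed queries are tried first
  done
fi
echo "try: $name."          # trailing dot: absolute name, queried as-is
```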
&lt;p&gt;One option to force a fully qualified domain name (FQDN) query is to append a trailing dot &lt;code&gt;.&lt;/code&gt; to the DNS name ^^&lt;/p&gt;
&lt;p&gt;Then why bother with the trailing &lt;code&gt;.&lt;/code&gt;? Can we make life easier?&lt;/p&gt;
&lt;p&gt;Sure. Then just update the configmap of &lt;code&gt;CoreDNS&lt;/code&gt; to this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;hosts {
    172.16.211.12 my-nginx.external.local my-nginx.external.local.service-type-test.svc.cluster.local
}

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now both &lt;code&gt;my-nginx.external.local&lt;/code&gt; and &lt;code&gt;my-nginx.external.local.&lt;/code&gt; work!&lt;/p&gt;
&lt;h2&gt;Integrate ExternalName And ClusterIP&lt;/h2&gt;
&lt;p&gt;And then you might ask: why did I use &lt;code&gt;172.16.211.12&lt;/code&gt;? Do I have to use the worker node IP where the pod is running to resolve the external name?&lt;/p&gt;
&lt;p&gt;Not necessarily! You don’t have to use the exact worker node IP where the pod is running. Instead, you should configure &lt;code&gt;my-nginx.external.local&lt;/code&gt; to resolve to an IP that can correctly route traffic to the Nginx pod.&lt;/p&gt;
&lt;p&gt;One solution, which I used here and also recommend, is to use a &lt;code&gt;ClusterIP&lt;/code&gt; service!&lt;/p&gt;
&lt;p&gt;:::tip
Yes, we can have both services for our nginx service!
:::&lt;/p&gt;
&lt;p&gt;Before that, we need to delete the existing service to get a clean start and ensure no service is running.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl delete -f ./nginx-deployment/nginx-externalname-service.yaml
kubectl get services -o wide

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, we need to update our yaml files, because so far both &lt;code&gt;nginx-deployment/nginx-externalname-service.yaml&lt;/code&gt; and &lt;code&gt;nginx-deployment/nginx-clusterip-service.yaml&lt;/code&gt; use the same name &lt;code&gt;nginx-service&lt;/code&gt;! In Kubernetes, a Service is uniquely identified by its name and namespace. Let&apos;s update the names a bit.&lt;/p&gt;
&lt;p&gt;I know, it&apos;s just a one line change. But let&apos;s make sure you have it correctly!&lt;/p&gt;
&lt;p&gt;&lt;code&gt;nginx-externalname-service.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: nginx-external-service
  namespace: service-type-test
spec:
  type: ExternalName
  externalName: my-nginx.external.local

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;nginx-clusterip-service.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: nginx-clusterip-service
  namespace: service-type-test
spec:
  selector:
    app: nginx
  type: ClusterIP
  ports:
  - port: 80
    targetPort: 80

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then perform the commands:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl apply -f nginx-deployment/nginx-clusterip-service.yaml
kubectl apply -f nginx-deployment/nginx-externalname-service.yaml
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then update &lt;code&gt;CoreDNS&lt;/code&gt; via &lt;code&gt;kubectl edit configmap -n kube-system coredns&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;hosts {
    10.98.205.55 my-nginx.external.local my-nginx.external.local.service-type-test.svc.cluster.local
    fallthrough
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Please note, I added &lt;code&gt;fallthrough&lt;/code&gt; this time, because &lt;code&gt;ClusterIP&lt;/code&gt; names are resolved by the &lt;code&gt;kubernetes&lt;/code&gt; plugin in &lt;code&gt;CoreDNS&lt;/code&gt; instead of the &lt;code&gt;hosts&lt;/code&gt; plugin. The &lt;code&gt;fallthrough&lt;/code&gt; directive lets later plugins (like &lt;code&gt;kubernetes&lt;/code&gt;) &lt;strong&gt;continue processing&lt;/strong&gt; a query when the entry isn&apos;t found in &lt;code&gt;hosts&lt;/code&gt; first.&lt;/p&gt;
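&lt;p&gt;For context, here is a sketch of where the &lt;code&gt;hosts&lt;/code&gt; block sits in the Corefile relative to the &lt;code&gt;kubernetes&lt;/code&gt; plugin (abridged; your actual Corefile has more directives):&lt;/p&gt;

```
.:53 {
    hosts {
        10.98.205.55 my-nginx.external.local my-nginx.external.local.service-type-test.svc.cluster.local
        fallthrough
    }
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    forward . /etc/resolv.conf
}
```

&lt;p&gt;Plugin order in CoreDNS is fixed at build time and &lt;code&gt;hosts&lt;/code&gt; runs before &lt;code&gt;kubernetes&lt;/code&gt;, which is exactly why &lt;code&gt;fallthrough&lt;/code&gt; is needed for ClusterIP names to reach the &lt;code&gt;kubernetes&lt;/code&gt; plugin.&lt;/p&gt;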
&lt;p&gt;Then run:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl rollout restart deployment coredns -n kube-system
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, let&apos;s test name resolution in &lt;code&gt;testpod&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[root@testpod /]# nslookup 10.98.205.55
55.205.98.10.in-addr.arpa       name = my-nginx.external.local.
55.205.98.10.in-addr.arpa       name = my-nginx.external.local.service-type-test.svc.cluster.local.

[root@testpod /]# nslookup my-nginx.external.local
Server:         10.96.0.10
Address:        10.96.0.10#53

Name:   my-nginx.external.local.service-type-test.svc.cluster.local
Address: 10.98.205.55

[root@testpod /]# nslookup my-nginx.external.local.
Server:         10.96.0.10
Address:        10.96.0.10#53

Name:   my-nginx.external.local
Address: 10.98.205.55

[root@testpod /]# nslookup nginx-clusterip-service.service-type-test
Server:         10.96.0.10
Address:        10.96.0.10#53

Name:   nginx-clusterip-service.service-type-test.svc.cluster.local
Address: 10.98.205.55

[root@testpod /]#

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And then test access to our Nginx service!&lt;/p&gt;
&lt;h2&gt;The Comparison of Access Methods&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Can Access &lt;code&gt;nginx-service&lt;/code&gt;?&lt;/th&gt;
&lt;th&gt;Method to Use&lt;/th&gt;
&lt;th&gt;Why?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;testpod (inside cluster, same namespace)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://nginx-clusterip-service.service-type-test:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Resolves to ClusterIP &lt;code&gt;10.98.205.55&lt;/code&gt;, accessible within the cluster.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;testpod (inside cluster, using ExternalName service)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://nginx-external-service.service-type-test:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;DNS resolves &lt;code&gt;my-nginx.external.local&lt;/code&gt; to &lt;code&gt;10.98.205.55&lt;/code&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;testpod (inside cluster, using Pod IP directly)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://10.244.1.4:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Works as Pod IP is routable within the cluster.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;testpod (inside cluster, different namespace)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://nginx-clusterip-service.service-type-test.svc.cluster.local:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;DNS resolves to ClusterIP, accessible within the cluster.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Worker nodes (&lt;code&gt;k8s-2&lt;/code&gt;, &lt;code&gt;k8s-3&lt;/code&gt;, etc.)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://10.98.205.55:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;ClusterIP is accessible from within the cluster.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Master node (&lt;code&gt;k8s-1&lt;/code&gt;)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;curl http://10.98.205.55:80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;ClusterIP is accessible from within the cluster.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Laptop (VMFusion, same &lt;code&gt;vmnet2&lt;/code&gt; network as worker nodes)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;These names resolve only inside the Kubernetes cluster.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Laptop using Pod IP directly (&lt;code&gt;curl http://10.244.1.4:80&lt;/code&gt;)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;Pod IPs (&lt;code&gt;10.244.x.x&lt;/code&gt;) are not reachable from outside the cluster.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Laptop (VMFusion) using LoadBalancer (if configured)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;No LoadBalancer is configured for this service yet.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Amazing!&lt;/p&gt;
&lt;h1&gt;Exploring LoadBalancer Service&lt;/h1&gt;
&lt;p&gt;Thanks for staying with me so far! I hope you enjoy my step-by-step explanation!&lt;/p&gt;
&lt;p&gt;Now let&apos;s clean up the service and start learning &lt;code&gt;LoadBalancer&lt;/code&gt; service!&lt;/p&gt;
&lt;p&gt;Since we&apos;re running Kubernetes &lt;strong&gt;inside VMFusion&lt;/strong&gt;, there&apos;s no &lt;strong&gt;cloud provider&lt;/strong&gt; to automatically assign a LoadBalancer IP. We&apos;ll need to use &lt;a href=&quot;https://metallb.io/&quot;&gt;&lt;strong&gt;MetalLB&lt;/strong&gt;&lt;/a&gt; as a software-based LoadBalancer for our cluster.&lt;/p&gt;
&lt;h2&gt;Install MetalLB&lt;/h2&gt;
&lt;p&gt;Go to the &lt;a href=&quot;https://github.com/metallb/metallb/tags&quot;&gt;MetalLB tags page&lt;/a&gt; and get the latest version (currently 0.14.9). Then apply it on our master node as &lt;code&gt;admin&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.9/config/manifests/metallb-native.yaml
kubectl get pods -n metallb-system
kubectl get crds | grep metallb
kubectl get svc -n metallb-system
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./metalb-system-pods-running.webp&quot; alt=&quot;metalb system pods running&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./metalb-system-crds-svc-running.webp&quot; alt=&quot;metalb system crds svc running&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;Define IPAddressPool&lt;/h2&gt;
&lt;p&gt;Create file &lt;code&gt;/home/admin/nginx-deployment/metalb-ipaddresspool.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;apiVersion: metallb.io/v1beta1  # Use v1beta1 for latest MetalLB versions
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
  - 172.16.211.200-172.16.211.210  # Define an IP range
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: l2-adv
  namespace: metallb-system

&lt;/code&gt;&lt;/pre&gt;
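&lt;p&gt;A note on the &lt;code&gt;L2Advertisement&lt;/code&gt; above: with an empty &lt;code&gt;spec&lt;/code&gt;, it announces every pool in the namespace. If you later add more pools and want this advertisement limited to one, you can reference the pool explicitly (a sketch using this tutorial&apos;s pool name):&lt;/p&gt;

```yaml
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: l2-adv
  namespace: metallb-system
spec:
  ipAddressPools:
  - default-pool
```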
&lt;pre&gt;&lt;code&gt;kubectl apply -f nginx-deployment/metalb-ipaddresspool.yaml
kubectl get ipaddresspools -n metallb-system
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./check-metalb-ipaddresspool.webp&quot; alt=&quot;check metalb ipaddresspool&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;Create LoadBalancer Service&lt;/h2&gt;
&lt;p&gt;Create file &lt;code&gt;/home/admin/nginx-deployment/nginx-loadbalancer-service.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: nginx-loadbalancer
  namespace: service-type-test
spec:
  selector:
    app: nginx
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 80

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Apply and check:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl apply -f nginx-deployment/nginx-loadbalancer-service.yaml
kubectl get svc -n service-type-test
curl http://172.16.211.200:80
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./loadbalancer-is-working.webp&quot; alt=&quot;loadbalancer is working&quot; /&gt;&lt;/p&gt;
&lt;p&gt;The above test was run on master node &lt;code&gt;k8s-1&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Let&apos;s also test from &lt;code&gt;testpod&lt;/code&gt; and from my laptop.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./loadbalancer-is-working-in-testpod.webp&quot; alt=&quot;loadbalancer is working in-testpod&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./loadbalancer-is-working-in-laptop.webp&quot; alt=&quot;loadbalancer is working in laptop&quot; /&gt;All work!&lt;/p&gt;
&lt;h2&gt;Wait! How Do We Know the LoadBalancer Is Balancing?!&lt;/h2&gt;
&lt;p&gt;That&apos;s a good question!&lt;/p&gt;
&lt;p&gt;Since &lt;strong&gt;MetalLB&lt;/strong&gt; operates at &lt;strong&gt;Layer 2 (default) or BGP&lt;/strong&gt;, it announces the service IP to our network, and traffic reaching that IP is then distributed &lt;strong&gt;across the pods&lt;/strong&gt; behind the service. Let’s simulate some traffic and test whether it is actually being balanced.&lt;/p&gt;
&lt;p&gt;Check how many pods your LoadBalancer service is distributing traffic to:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ kubectl get endpoints -n service-type-test nginx-loadbalancer

NAME                 ENDPOINTS       AGE
nginx-loadbalancer   10.244.1.4:80   38m

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We expect to see a set of endpoint IPs to balance across... What?! It only has one IP! Thinking...&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./8-hours-later.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Oh! I remember now! Our Nginx pod was set to run on just one node!&lt;/p&gt;
&lt;p&gt;To refresh your memory, here is the file we used to deploy the nginx pod:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ cat nginx-deployment/nginx-single-node.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-single
  namespace: service-type-test
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      nodeSelector:
        kubernetes.io/hostname: k8s-2
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To truly test if MetalLB&apos;s LoadBalancer is distributing traffic, we need multiple pods running behind the service. If only one pod is available, all incoming requests will always hit that single pod, making it impossible to observe any load balancing in action. Kubernetes distributes traffic only among pods that match the service selector, so if there’s just one, there’s nothing to balance! To fix this, we should scale the deployment to at least two or three replicas and then send multiple requests to see how they get distributed. Let’s scale it up and test again! 🚀&lt;/p&gt;
&lt;h3&gt;Update Nginx Deployment Yaml&lt;/h3&gt;
&lt;p&gt;Typically we can use the &lt;code&gt;kubectl&lt;/code&gt; command to update &lt;code&gt;replicas&lt;/code&gt;, like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl scale deployment nginx-single -n service-type-test --replicas=3
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;However, as you can see, our previous Nginx yaml is named &lt;code&gt;nginx-single&lt;/code&gt;, which would be misleading now. Let&apos;s just delete it and recreate one named &lt;code&gt;nginx-multiple-nodes&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ kubectl delete -f nginx-deployment/nginx-single-node.yaml
deployment.apps &quot;nginx-single&quot; deleted
❯ kubectl get pods

NAME      READY   STATUS    RESTARTS   AGE
testpod   1/1     Running   0          2d20h
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The new nginx yaml file &lt;code&gt;/home/admin/nginx-deployment/nginx-multiple-nodes.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-multiple-nodes
  namespace: service-type-test
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80

&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;kubectl apply -f nginx-deployment/nginx-multiple-nodes.yaml
kubectl get pods -o wide
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./nginx-running-on-multiple-nodes.webp&quot; alt=&quot;nginx running on multiple nodes&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Check &lt;code&gt;kubectl get endpoints&lt;/code&gt; again:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./check-endpoints.webp&quot; alt=&quot;check endpoints again&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Now we can see three IPs assigned!&lt;/p&gt;
&lt;h3&gt;A Quick Try for Testing?&lt;/h3&gt;
&lt;p&gt;I can send multiple requests from &lt;strong&gt;my laptop&lt;/strong&gt; to the LoadBalancer IP:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;for i in {1..10}; do curl -s http://172.16.211.200 | grep &quot;Welcome&quot;; done
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If load balancing is working correctly, some responses should come from different pods.&lt;/p&gt;
&lt;p&gt;Are you kidding? How do I know? This is what I got!&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ for i in {1..10}; do curl -s http://172.16.211.200 | grep &quot;Welcome&quot;; done

&amp;lt;title&amp;gt;Welcome to nginx!&amp;lt;/title&amp;gt;
&amp;lt;h1&amp;gt;Welcome to nginx!&amp;lt;/h1&amp;gt;
&amp;lt;title&amp;gt;Welcome to nginx!&amp;lt;/title&amp;gt;
&amp;lt;h1&amp;gt;Welcome to nginx!&amp;lt;/h1&amp;gt;
&amp;lt;title&amp;gt;Welcome to nginx!&amp;lt;/title&amp;gt;
&amp;lt;h1&amp;gt;Welcome to nginx!&amp;lt;/h1&amp;gt;
&amp;lt;title&amp;gt;Welcome to nginx!&amp;lt;/title&amp;gt;
&amp;lt;h1&amp;gt;Welcome to nginx!&amp;lt;/h1&amp;gt;
&amp;lt;title&amp;gt;Welcome to nginx!&amp;lt;/title&amp;gt;
&amp;lt;h1&amp;gt;Welcome to nginx!&amp;lt;/h1&amp;gt;
&amp;lt;title&amp;gt;Welcome to nginx!&amp;lt;/title&amp;gt;
&amp;lt;h1&amp;gt;Welcome to nginx!&amp;lt;/h1&amp;gt;
&amp;lt;title&amp;gt;Welcome to nginx!&amp;lt;/title&amp;gt;
&amp;lt;h1&amp;gt;Welcome to nginx!&amp;lt;/h1&amp;gt;
&amp;lt;title&amp;gt;Welcome to nginx!&amp;lt;/title&amp;gt;
&amp;lt;h1&amp;gt;Welcome to nginx!&amp;lt;/h1&amp;gt;
&amp;lt;title&amp;gt;Welcome to nginx!&amp;lt;/title&amp;gt;
&amp;lt;h1&amp;gt;Welcome to nginx!&amp;lt;/h1&amp;gt;
&amp;lt;title&amp;gt;Welcome to nginx!&amp;lt;/title&amp;gt;
&amp;lt;h1&amp;gt;Welcome to nginx!&amp;lt;/h1&amp;gt;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I see.&lt;/p&gt;
&lt;h3&gt;Customize Nginx default.conf via ConfigMap&lt;/h3&gt;
&lt;p&gt;We can update Nginx &lt;code&gt;default.conf&lt;/code&gt; to add a header holding the server&apos;s &lt;code&gt;hostname&lt;/code&gt; to every response, so we should see different values in the response headers, telling us MetalLB is working well.&lt;/p&gt;
&lt;p&gt;We don&apos;t need to rebuild the Nginx image to use a custom &lt;code&gt;default.conf&lt;/code&gt;; we can just put it in a &lt;code&gt;configmap&lt;/code&gt; and mount it into our Nginx pods.&lt;/p&gt;
&lt;p&gt;You might ask: but what is a &lt;code&gt;ConfigMap&lt;/code&gt;, and why do it this way? If we want to update the Nginx config file, shouldn&apos;t we log into the pod and update the file manually?&lt;/p&gt;
&lt;p&gt;Ah ha! Gotcha!&lt;/p&gt;
&lt;p&gt;You&apos;re right—on a traditional Linux server, you&apos;d SSH in, modify &lt;code&gt;/etc/nginx/nginx.conf&lt;/code&gt;, and restart Nginx. But in Kubernetes, there&apos;s a more scalable and automated way to manage configurations - That is &lt;code&gt;ConfigMap&lt;/code&gt;.&lt;/p&gt;
&lt;h4&gt;&lt;strong&gt;What Exactly Is a ConfigMap in Kubernetes?&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Alright, think of a &lt;strong&gt;ConfigMap&lt;/strong&gt; as &lt;strong&gt;a central place to store our application&apos;s configuration&lt;/strong&gt;—instead of hardcoding settings inside our container image, we define them separately and let Kubernetes inject them when needed.&lt;/p&gt;
&lt;h5&gt;&lt;strong&gt;Why Does This Matter?&lt;/strong&gt;&lt;/h5&gt;
&lt;p&gt;Imagine running the same application in different environments (development, testing, production). We wouldn’t want to rebuild the container image every time just to change a database URL, an API key, or a logging level. Instead, we store these settings in a ConfigMap, and our pods pull the configuration dynamically at runtime.&lt;/p&gt;
&lt;hr /&gt;
&lt;h5&gt;&lt;strong&gt;How Does a ConfigMap Work?&lt;/strong&gt;&lt;/h5&gt;
&lt;p&gt;A &lt;strong&gt;ConfigMap&lt;/strong&gt; in Kubernetes can store:&lt;br /&gt;
✅ Key-value pairs (like environment variables)&lt;br /&gt;
✅ Entire configuration files&lt;br /&gt;
✅ Command-line arguments&lt;/p&gt;
&lt;p&gt;Once created, a ConfigMap is &lt;strong&gt;stored in Kubernetes, which can inject it into pods&lt;/strong&gt; as:&lt;br /&gt;
🔹 &lt;strong&gt;Environment variables&lt;/strong&gt;&lt;br /&gt;
🔹 &lt;strong&gt;Mounted files (as volumes)&lt;/strong&gt;&lt;br /&gt;
🔹 &lt;strong&gt;Command-line arguments&lt;/strong&gt;&lt;/p&gt;
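&lt;p&gt;As a quick illustration of the first option, here is a minimal hypothetical pod fragment that pulls a single key from a ConfigMap into an environment variable. The names &lt;code&gt;app-config&lt;/code&gt; and &lt;code&gt;log_level&lt;/code&gt; are made up for this sketch:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Hypothetical sketch: expose the &apos;log_level&apos; key of ConfigMap &apos;app-config&apos;
# to the container as the LOG_LEVEL environment variable
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  containers:
  - name: app
    image: busybox
    command: [&quot;sh&quot;, &quot;-c&quot;, &quot;echo $LOG_LEVEL; sleep 3600&quot;]
    env:
    - name: LOG_LEVEL
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: log_level
&lt;/code&gt;&lt;/pre&gt;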
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;Our case: Storing an Nginx Config in ConfigMap&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;Instead of modifying the Nginx image&apos;s &lt;code&gt;default.conf&lt;/code&gt; manually inside a pod (which would get lost after a restart), we create a &lt;strong&gt;ConfigMap&lt;/strong&gt; at &lt;code&gt;/home/admin/nginx-deployment/nginx-config.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-config
  namespace: service-type-test
data:
  default.conf: |
    server {
      listen 80;
      location / {
        add_header X-Served-By $hostname;
        root /usr/share/nginx/html;
        index index.html;
      }
    }

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here, &lt;code&gt;default.conf&lt;/code&gt; is a key, and its value is the actual Nginx configuration file.&lt;/p&gt;
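&lt;p&gt;By the way, instead of hand-writing this YAML, &lt;code&gt;kubectl&lt;/code&gt; can also generate an equivalent manifest from the file itself. A sketch, assuming &lt;code&gt;default.conf&lt;/code&gt; is saved in the current directory:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Sketch: generate (without actually creating) the same ConfigMap from a local file
kubectl create configmap nginx-config \
  --from-file=default.conf \
  --namespace=service-type-test \
  --dry-run=client -o yaml
&lt;/code&gt;&lt;/pre&gt;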
&lt;p&gt;Let&apos;s apply it into Kubernetes:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl apply -f nginx-deployment/nginx-config.yaml
kubectl get configmaps
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You should see:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ kubectl get configmaps

NAME               DATA   AGE
kube-root-ca.crt   1      5d13h
nginx-config       1      101s

&lt;/code&gt;&lt;/pre&gt;
&lt;hr /&gt;
&lt;h4&gt;&lt;strong&gt;How Do We Use This ConfigMap in a Pod?&lt;/strong&gt;&lt;/h4&gt;
&lt;p&gt;We need to mount the above ConfigMap inside our &lt;code&gt;nginx-multiple-nodes&lt;/code&gt; deployment so that every pod automatically loads the config on startup. To do this, let&apos;s just create a new nginx deployment yaml at &lt;code&gt;/home/admin/nginx-deployment/nginx-multiple-nodes-with-custom.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-multiple-nodes
  namespace: service-type-test
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
        volumeMounts:
        - name: nginx-config-volume
          mountPath: /etc/nginx/conf.d/default.conf  # This is where we inject the file
          subPath: default.conf                      # subPath to tell K8S to only use the value of key &apos;default.conf&apos; from the volume which is a configMap
      volumes:
      - name: nginx-config-volume
        configMap:
          name: nginx-config
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;What’s Happening Here?&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;We created a volume backed by the configMap named &lt;code&gt;nginx-config&lt;/code&gt;, which was applied in the previous step.&lt;/li&gt;
&lt;li&gt;We then mount the &lt;code&gt;nginx-config&lt;/code&gt; ConfigMap &lt;strong&gt;as a file&lt;/strong&gt; at &lt;code&gt;/etc/nginx/conf.d/default.conf&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The original &lt;code&gt;default.conf&lt;/code&gt; that ships with the Nginx image is overridden, so Nginx uses this file instead of the original default one.&lt;/li&gt;
&lt;li&gt;If we update the ConfigMap in Kubernetes in the future, we can just restart the Nginx pods—&lt;strong&gt;no need to rebuild the container at all&lt;/strong&gt;!&lt;/li&gt;
&lt;/ul&gt;
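&lt;p&gt;For example, once this deployment is running, a future config change is just a two-command cycle, with no image rebuild involved:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Push the updated ConfigMap, then restart the pods so they pick it up
kubectl apply -f nginx-deployment/nginx-config.yaml
kubectl rollout restart deployment nginx-multiple-nodes -n service-type-test
&lt;/code&gt;&lt;/pre&gt;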
&lt;p&gt;So now you see why: &lt;strong&gt;Kubernetes treats containers as immutable&lt;/strong&gt;—any manual changes inside a running pod are lost when it restarts. ConfigMaps solve this by separating configuration from the application, making it:&lt;br /&gt;
✅ &lt;strong&gt;Easier to update&lt;/strong&gt; (without modifying the container image)&lt;br /&gt;
✅ &lt;strong&gt;More flexible&lt;/strong&gt; (different configs for different environments)&lt;br /&gt;
✅ &lt;strong&gt;More scalable&lt;/strong&gt; (all pods pull the latest config automatically)&lt;/p&gt;
&lt;p&gt;Let&apos;s delete the existing Nginx deployment and apply the new one:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl delete -f nginx-deployment/nginx-multiple-nodes.yaml
kubectl apply -f nginx-deployment/nginx-multiple-nodes-with-custom.yaml
kubectl get pods -o wide

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./nginx-pods-running-in-new-deployment.webp&quot; alt=&quot;nginx pods running in new deployment&quot; /&gt;&lt;/p&gt;
&lt;h3&gt;Test Again!&lt;/h3&gt;
&lt;p&gt;Go back to my laptop:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;for i in {1..10}; do curl -i -s http://172.16.211.200 | grep &quot;X-Served-By&quot;; done
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Please note, I added &lt;code&gt;-i&lt;/code&gt; to include response headers in the output.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./curl-can-see-headers-added-by-different-workernodes.webp&quot; alt=&quot;curl can see headers added by different worker nodes&quot; /&gt;&lt;/p&gt;
&lt;h3&gt;Any Specialized Tools for Load Testing?&lt;/h3&gt;
&lt;p&gt;Okay, since you asked, let&apos;s use &lt;code&gt;hey&lt;/code&gt; (&lt;a href=&quot;https://github.com/rakyll/hey&quot;&gt;link here&lt;/a&gt;)!&lt;/p&gt;
&lt;p&gt;Install it on my mac:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;brew install hey
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In order to get a clean start on logs, let&apos;s restart our deployment:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;kubectl rollout restart deployment nginx-multiple-nodes -n service-type-test
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is our new pods status:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ kubectl get pods -o wide

NAME                                    READY   STATUS    RESTARTS   AGE     IP            NODE    NOMINATED NODE   READINESS GATES
nginx-multiple-nodes-668cdc96dd-4v8db   1/1     Running   0          2m46s   10.244.4.15   k8s-5   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
nginx-multiple-nodes-668cdc96dd-cjz2r   1/1     Running   0          2m59s   10.244.1.11   k8s-2   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
nginx-multiple-nodes-668cdc96dd-zlwqv   1/1     Running   0          2m34s   10.244.2.11   k8s-3   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;
testpod                                 1/1     Running   0          3d13h   10.244.2.5    k8s-3   &amp;lt;none&amp;gt;           &amp;lt;none&amp;gt;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;hey -n 1000 -c 50 http://172.16.211.200
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It means:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Send &lt;code&gt;1000&lt;/code&gt; HTTP requests to &lt;code&gt;http://172.16.211.200&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Send &lt;code&gt;50&lt;/code&gt; requests concurrently.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This is our output:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ hey -n 1000 -c 50 http://172.16.211.200

Summary:
  Total:        0.4999 secs
  Slowest:      0.0759 secs
  Fastest:      0.0135 secs
  Average:      0.0242 secs
  Requests/sec: 2000.3376

  Total data:   599010 bytes
  Size/request: 615 bytes

Response time histogram:
  0.013 [1]     |
  0.020 [114]   |■■■■■■
  0.026 [749]   |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.032 [36]    |■■
  0.038 [14]    |■
  0.045 [10]    |■
  0.051 [0]     |
  0.057 [4]     |
  0.063 [1]     |
  0.070 [27]    |■
  0.076 [18]    |■

Latency distribution:
  10% in 0.0193 secs
  25% in 0.0203 secs
  50% in 0.0211 secs
  75% in 0.0232 secs
  90% in 0.0272 secs
  95% in 0.0549 secs
  99% in 0.0716 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0004 secs, 0.0135 secs, 0.0759 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0000 secs
  req write:    0.0000 secs, 0.0000 secs, 0.0004 secs
  resp wait:    0.0222 secs, 0.0133 secs, 0.0553 secs
  resp read:    0.0003 secs, 0.0000 secs, 0.0047 secs

Status code distribution:
  [200] 974 responses

Error distribution:
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61257-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61258-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61260-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61263-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61264-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61265-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61266-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61267-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61268-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61269-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61270-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61271-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61272-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61273-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61274-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61275-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61276-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61277-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61278-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61279-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61280-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61281-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61282-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61283-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61284-&amp;gt;172.16.211.200:80: read: connection reset by peer
  [1]   Get &quot;http://172.16.211.200&quot;: read tcp 172.16.211.1:61286-&amp;gt;172.16.211.200:80: read: connection reset by peer

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But do we really trust the data from &lt;code&gt;hey&lt;/code&gt;?&lt;/p&gt;
&lt;p&gt;No worries—since we have a clean start, we can check the Nginx pod logs in Kubernetes to see which client each request came from!&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ kubectl logs -l app=nginx -n service-type-test --tail=1000 | awk &apos;{print $1}&apos; | grep -E &apos;^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$&apos; | sort | uniq -c

    334 10.244.1.1
    235 10.244.2.1
    405 10.244.4.1

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Add up &lt;code&gt;334 + 235 + 405&lt;/code&gt;: it&apos;s &lt;code&gt;974&lt;/code&gt;, matching the &lt;code&gt;hey&lt;/code&gt; output &lt;code&gt;[200] 974 responses&lt;/code&gt;!!!&lt;/p&gt;
&lt;p&gt;I feel so satisfied!&lt;/p&gt;
&lt;p&gt;Wait! You might have noticed that the IPs in &lt;code&gt;kubectl logs&lt;/code&gt; are not the IPs in &lt;code&gt;kubectl get pods&lt;/code&gt;, and they aren&apos;t our laptop&apos;s IP either... so how can we use this data to say &quot;&lt;strong&gt;it matches&lt;/strong&gt;&quot;??&lt;/p&gt;
&lt;p&gt;Good observation!&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;.1 address&lt;/code&gt; in each subnet (e.g., &lt;code&gt;10.244.1.1&lt;/code&gt; and &lt;code&gt;10.244.2.1&lt;/code&gt;) is assigned to &lt;code&gt;cni0&lt;/code&gt;, the bridge interface created by CNI (Flannel in our case). When traffic arrives at a pod, if it comes from another node, it first passes through Flannel&apos;s virtual network (&lt;code&gt;cni0&lt;/code&gt;). By default, Nginx logs the IP of the last network hop—which in this case is the Flannel bridge (cni0) instead of the original client.&lt;/p&gt;
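&lt;p&gt;As a side note, if we wanted Nginx to log the real client IP instead, Kubernetes supports setting &lt;code&gt;externalTrafficPolicy: Local&lt;/code&gt; on the service. It preserves the original source IP, at the cost of only routing traffic to pods on the node that receives it. A sketch of the relevant service fields:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: nginx-loadbalancer
  namespace: service-type-test
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local  # preserve the original client source IP
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
&lt;/code&gt;&lt;/pre&gt;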
&lt;h1&gt;Comparison between ClusterIP, NodePort, ExternalName and LoadBalancer&lt;/h1&gt;
&lt;p&gt;I know, a comparison table is always helpful at the end of a post!&lt;/p&gt;
&lt;p&gt;Here you go:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service Type&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;How It Works&lt;/th&gt;
&lt;th&gt;When to Use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ClusterIP&lt;/strong&gt; (Default)&lt;/td&gt;
&lt;td&gt;Internal communication within the cluster&lt;/td&gt;
&lt;td&gt;Creates a stable internal IP that other pods can use&lt;/td&gt;
&lt;td&gt;Use when exposing a service only to other pods (e.g., backend services, databases)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;NodePort&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Expose services externally via a node&apos;s IP and a high-numbered port&lt;/td&gt;
&lt;td&gt;Maps a fixed port (30000-32767) on each node to the service&lt;/td&gt;
&lt;td&gt;Use when external access is needed without a LoadBalancer, mainly for development &amp;amp; testing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LoadBalancer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Expose services externally with a dedicated external IP&lt;/td&gt;
&lt;td&gt;Allocates an external IP via cloud provider or MetalLB&lt;/td&gt;
&lt;td&gt;Use when running on a cloud provider or using MetalLB in bare-metal environments&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ExternalName&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Map a service name to an external DNS name&lt;/td&gt;
&lt;td&gt;DNS lookup redirects traffic to an external domain&lt;/td&gt;
&lt;td&gt;Use when integrating Kubernetes services with external systems (e.g., external databases or APIs)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h1&gt;🎉 Congratulations!&lt;/h1&gt;
&lt;p&gt;And that’s a wrap for our Part 5: &lt;code&gt;ExternalName&lt;/code&gt; and &lt;code&gt;LoadBalancer&lt;/code&gt;! 🎉  This one was a deep dive, but seeing everything come together feels amazing. We&apos;ve tackled how Kubernetes handles external services and dynamic traffic distribution—powerful stuff! But we’re not stopping here.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Stay tuned for the next post!&lt;/strong&gt; 😎🔥&lt;/p&gt;
&lt;p&gt;:::info
You&apos;re on a roll! Don&apos;t stop now—check out the full series and level up your Kubernetes skills. Each post builds on the last, so make sure you haven’t missed anything! 👇&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/kubernetes-tutorial-part1&quot;&gt;Part 1&lt;/a&gt;&lt;/strong&gt;, I laid out the &lt;strong&gt;networking plan&lt;/strong&gt;, my &lt;strong&gt;goals for setting up Kubernetes&lt;/strong&gt;, and how to &lt;strong&gt;prepare a base VM image&lt;/strong&gt; for the cluster.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/tutorial-part2-dns-server-ntp&quot;&gt;Part 2&lt;/a&gt;&lt;/strong&gt;, I walked through &lt;strong&gt;configuring a local DNS server and NTP server&lt;/strong&gt;, essential for stable name resolution and time synchronization across nodes locally. These foundational steps will make our Kubernetes setup smoother.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/part3-kubernetes-cluster-setup&quot;&gt;Part 3&lt;/a&gt;&lt;/strong&gt;, I finished the Kubernetes cluster setup with Flannel, ending up with one Kubernetes master and four worker nodes ready for real workloads.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/part3-kubernetes-cluster-setup&quot;&gt;Part 4&lt;/a&gt;&lt;/strong&gt;, I explored NodePort and ClusterIP, and covered the key differences, use cases, and when to choose each for internal and external service access! 🔥&lt;/p&gt;
&lt;p&gt;🚀 In Part 5, the current one, I dived into &lt;code&gt;ExternalName&lt;/code&gt; and &lt;code&gt;LoadBalancer&lt;/code&gt; services, uncovering how they handle external access, DNS resolution, and dynamic traffic distribution!
:::&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Mastering Terraform with AWS Guide Part 1: Launch Real AWS Infrastructure with VPC, IAM and EC2</title><link>https://geekcoding101.com/posts/1-terraform-with-aws-iam-ec2</link><guid isPermaLink="true">https://geekcoding101.com/posts/1-terraform-with-aws-iam-ec2</guid><pubDate>Tue, 15 Apr 2025 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;So… you’ve heard about Terraform. Maybe your team is using it, maybe your cloud dreams demand it — or maybe, like me, you’ve been deep in the Kubernetes jungle (&lt;a href=&quot;/posts/kubernetes-tutorial-part1&quot;&gt;My blog posts about K8S&lt;/a&gt;) and now want a declarative friend for AWS too! Either way, welcome aboard. In this post, I’ll walk you through setting up your &lt;a href=&quot;https://www.terraform.io/&quot;&gt;Terraform&lt;/a&gt; with &lt;a href=&quot;https://aws.amazon.com/&quot;&gt;AWS&lt;/a&gt; environment from scratch, on a Mac.&lt;/p&gt;
&lt;p&gt;We’ll start simple and go all the way to managing VPC, Security Groups, IAM users and EC2 infrastructure using best practices, all built with Terraform on AWS. By the end, you’ll not only run Terraform with AWS — you’ll be able to &lt;em&gt;answer questions such as&lt;/em&gt; how to run Terraform with AWS, how to create an AWS EC2 instance using Terraform, and how to create a security group in AWS using Terraform... fantastic!&lt;/p&gt;
&lt;h1&gt;What is Terraform?&lt;/h1&gt;
&lt;p&gt;&lt;a href=&quot;https://www.terraform.io/&quot;&gt;Terraform&lt;/a&gt; is an &lt;a href=&quot;https://developer.hashicorp.com/terraform/tutorials/aws-get-started/infrastructure-as-code&quot;&gt;&lt;strong&gt;open-source infrastructure as code (IaC)&lt;/strong&gt;&lt;/a&gt; tool created by &lt;a href=&quot;https://www.hashicorp.com/en&quot;&gt;HashiCorp&lt;/a&gt;. It lets you define, provision, and manage cloud infrastructure using human-readable config files written in &lt;a href=&quot;https://github.com/hashicorp/hcl&quot;&gt;HCL (HashiCorp Configuration Language)&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Think of it as Git for your cloud — but with superpowers.&lt;/p&gt;
&lt;p&gt;I know that&apos;s a rather short introduction, so let&apos;s look at a real-life scenario to understand what it is and why we need it!&lt;/p&gt;
&lt;h1&gt;Why Do We Actually Need Terraform? A Real-Life Scenario&lt;/h1&gt;
&lt;p&gt;Let’s say you’re an ambitious DevOps engineer named Alice. One day your boss comes in hot:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;“Hey Alice! We need 3 EC2 instances on AWS, 2 on Azure, and an S3 bucket for backups. Oh — and don’t forget a VPC, IAM roles, a database, some tags, and make it all &lt;em&gt;repeatable&lt;/em&gt;. By lunch.”&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;No pressure, right?&lt;/p&gt;
&lt;p&gt;Without Terraform with AWS, you&apos;d be:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Clicking through &lt;strong&gt;three&lt;/strong&gt; different cloud consoles 🖱️&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Copy-pasting IPs into random docs 📋&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Forgetting what you named stuff by the third resource 😵‍💫&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Swearing at yourself during the teardown: “Wait, which region was that bucket in?”&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Now imagine doing this &lt;strong&gt;again&lt;/strong&gt; next week — for dev, staging, and prod. Nightmare fuel.&lt;/p&gt;
&lt;h2&gt;Enter Terraform: Your Cloud Wizard&lt;/h2&gt;
&lt;p&gt;Now, we have Terraform with AWS:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;You write the infrastructure &lt;em&gt;once&lt;/em&gt; in &lt;code&gt;.tf&lt;/code&gt; files&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Want 10 EC2s instead of 3? Change &lt;code&gt;count = 10&lt;/code&gt;, re-run&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Need to deploy the same setup on Azure? Change the provider&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Broke something? &lt;code&gt;terraform destroy&lt;/code&gt; to the rescue&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It&apos;s like having a universal &lt;strong&gt;remote control for cloud resources&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;You got this! Terraform with AWS makes managing AWS cloud infrastructure not only repeatable, but also version-controlled — just like your code.&lt;/p&gt;
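&lt;p&gt;To make the &quot;change &lt;code&gt;count&lt;/code&gt;, re-run&quot; idea concrete, here is a minimal hypothetical HCL sketch. The resource name &lt;code&gt;web&lt;/code&gt; and the variable &lt;code&gt;var.ami_id&lt;/code&gt; are assumptions for illustration only:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Hypothetical sketch: scaling the fleet is a one-line change
resource &quot;aws_instance&quot; &quot;web&quot; {
  count         = 3           # change to 10 and re-run &apos;terraform apply&apos;
  ami           = var.ami_id  # assumed to be defined elsewhere
  instance_type = &quot;t2.micro&quot;
}
&lt;/code&gt;&lt;/pre&gt;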
&lt;hr /&gt;
&lt;h2&gt;In Short&lt;/h2&gt;
&lt;p&gt;Terraform keeps your cloud clean, consistent, and version-controlled — no more &lt;em&gt;“what did I click last Tuesday?”&lt;/em&gt; mysteries. It helps you:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Reuse configs like code&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Version control your infrastructure&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Avoid human errors from clicking the wrong dropdown&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Automate across multiple environments (dev, staging, prod)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Sleep better knowing you can recreate your stack in seconds&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So next time someone says &lt;em&gt;“spin up a new environment”&lt;/em&gt;, you won’t sweat it — you’ll &lt;code&gt;terraform apply&lt;/code&gt; and sip your coffee like a boss. ☕ And of course, it&apos;s not just Terraform with AWS: you can work with different providers and easily keep them consistent!&lt;/p&gt;
&lt;h1&gt;Step-by-Step Guide to Learn/Practice Terraform&lt;/h1&gt;
&lt;h2&gt;Install Terraform on macOS&lt;/h2&gt;
&lt;p&gt;Let’s install Terraform using Homebrew:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;brew tap hashicorp/tap 
brew install hashicorp/tap/terraform
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Confirm installation:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;terraform version
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./terraform-version.jpg&quot; alt=&quot;terraform version&quot; title=&quot;terraform version&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;Setup Terraform Aliases (Optional but Awesome)&lt;/h2&gt;
&lt;p&gt;If you&apos;re lazy (like all great engineers), add these aliases to your &lt;code&gt;~/.zshenv&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Terraform
alias f=&apos;terraform&apos;
alias finit=&apos;terraform init&apos;
alias fv=&apos;terraform validate&apos;
alias fp=&apos;terraform plan&apos;
alias fpo=&apos;terraform plan -out &apos;
alias fa=&apos;terraform apply&apos;
alias faa=&apos;terraform apply --auto-approve&apos;
alias fcon=&apos;terraform console&apos;
alias fgra=&apos;terraform graph&apos;
alias fo=&apos;terraform output &apos;
alias fs=&apos;terraform show &apos;
alias fsj=&apos;terraform show -json &apos;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then run:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;source ~/.zshenv
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Boom. Productivity unlocked. 🚀&lt;/p&gt;
&lt;p&gt;:::info
You might ask... hey, why choose &lt;code&gt;f&lt;/code&gt; as the alias for &lt;code&gt;terraform&lt;/code&gt; instead of something like &lt;code&gt;tf&lt;/code&gt;? Good question! Why type two characters when one will do? And &lt;code&gt;f&lt;/code&gt; sits right under your finger, more convenient than reaching for &lt;code&gt;t&lt;/code&gt;!
:::&lt;/p&gt;
&lt;h2&gt;Beginner Script to Explore Terraform Language&lt;/h2&gt;
&lt;p&gt;Before we jump into Terraform with AWS, we need to make sure we understand how Terraform works without involving AWS.&lt;/p&gt;
&lt;p&gt;Let’s create some &lt;code&gt;.tf&lt;/code&gt; files to practice variables, data sources, and conditionals.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;mkdir -p tutorial/basic
cd tutorial/basic
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Save this as &lt;code&gt;test-vars.tf&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;variable &quot;my-test&quot; {
  type    = number
  default = 123
}

variable &quot;my-map&quot; {
  type = map(any)
  default = {
    &quot;key1&quot; = &quot;value1&quot;
    &quot;key2&quot; = &quot;value2&quot;
  }
}

variable &quot;my-list&quot; {
  type = list(any)
  default = [
    &quot;value1&quot;,
    &quot;value2&quot;
  ]
}

output &quot;my-test&quot; {
  value = {
    value1 = var.my-map[&quot;key1&quot;]
    value2 = var.my-list[0]
  }
}

variable &quot;environment&quot; {
  type    = string
  default = &quot;dev&quot;
}

output &quot;conditional-test-output&quot; {
  value = var.environment == &quot;dev&quot; ? &quot;Development Environment&quot; : &quot;Production Environment&quot;
}

data &quot;local_file&quot; &quot;local_file_example&quot; {
  filename = &quot;${path.module}/test-vars.tf&quot;
}

output &quot;file-content&quot; {
  value = data.local_file.local_file_example.content
}

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The Terraform script above gives us a nice little playground for several core features: variables with different &lt;strong&gt;data types&lt;/strong&gt;, &lt;strong&gt;outputs&lt;/strong&gt;, &lt;strong&gt;conditional expressions&lt;/strong&gt;, indexing into a &lt;strong&gt;list&lt;/strong&gt; or &lt;strong&gt;map&lt;/strong&gt;, and a &lt;strong&gt;data source&lt;/strong&gt; that reads a local file.&lt;/p&gt;
&lt;p&gt;It starts by defining three variables:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;my-test&lt;/code&gt; is a &lt;strong&gt;number&lt;/strong&gt; type with a default of &lt;code&gt;123&lt;/code&gt;,&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;my-map&lt;/code&gt; is a &lt;strong&gt;map&lt;/strong&gt; with arbitrary values (using &lt;code&gt;any&lt;/code&gt;),&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;my-list&lt;/code&gt; is a &lt;strong&gt;list&lt;/strong&gt; also holding values of any type.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Data Types&lt;/h3&gt;
&lt;p&gt;There is always a question that comes up in every programming language, and Terraform HCL is no different:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;How to declare variables in terraform?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I pulled together a table to illustrate the &lt;strong&gt;data types&lt;/strong&gt;:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Data Type&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;string&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A sequence of Unicode characters (text)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;variable &quot;env&quot; {      type = string      default = &quot;dev&quot;  }&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;number&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A numeric value (int or float)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;variable &quot;count&quot; {      type = number      default = 3  }&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;bool&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Boolean (true or false)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;variable &quot;enabled&quot; {      type = bool      default = true  }&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;list(type)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Ordered sequence of values of same type&lt;/td&gt;
&lt;td&gt;&lt;code&gt;variable &quot;names&quot; {      type = list(string)      default = [&quot;a&quot;, &quot;b&quot;]  }&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;map(type)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Key-value pair object with same type values&lt;/td&gt;
&lt;td&gt;&lt;code&gt;variable &quot;tags&quot; {      type = map(string)      default = { &quot;env&quot; = &quot;prod&quot; }  }&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;set(type)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Like a list, but unordered and unique&lt;/td&gt;
&lt;td&gt;&lt;code&gt;variable &quot;unique_ids&quot; {      type = set(string)      default = [&quot;a&quot;, &quot;b&quot;, &quot;a&quot;]  }&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;tuple([types])&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Ordered collection of mixed types&lt;/td&gt;
&lt;td&gt;&lt;code&gt;variable &quot;example&quot; {      type = tuple([string, number])      default = [&quot;x&quot;, 10]  }&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;object({ ... })&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Structured object with named attributes&lt;/td&gt;
&lt;td&gt;&lt;code&gt;variable &quot;config&quot; {      type = object({ name = string, count = number })      default = { name = &quot;x&quot;, count = 1 }  }&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;any&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Wildcard for any type (use sparingly)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;variable &quot;dynamic_input&quot; {      type = any      default = &quot;maybe anything&quot;  }&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
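&lt;p&gt;As a quick sketch of how these declarations behave at runtime: a variable&apos;s default can be overridden on the command line or via a &lt;code&gt;TF_VAR_&lt;/code&gt; environment variable (the variable name here matches the &lt;code&gt;environment&lt;/code&gt; variable in &lt;code&gt;test-vars.tf&lt;/code&gt; above):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Override the &quot;environment&quot; variable for a single plan
terraform plan -var=&apos;environment=prod&apos;

# Or export it so every terraform command picks it up
export TF_VAR_environment=prod
terraform plan
&lt;/code&gt;&lt;/pre&gt;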
&lt;h3&gt;Output Block&lt;/h3&gt;
&lt;p&gt;Then we have an &lt;code&gt;output &quot;my-test&quot;&lt;/code&gt; block that shows how to extract values from these structures: it pulls &lt;code&gt;key1&lt;/code&gt; from the map and the first element of the list, showcasing &lt;strong&gt;interpolation&lt;/strong&gt; and &lt;strong&gt;indexing&lt;/strong&gt;. After running &lt;code&gt;terraform apply&lt;/code&gt;, this output displays &lt;code&gt;value1&lt;/code&gt; from the map and the first item from &lt;code&gt;my-list&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;:::info
&lt;strong&gt;Output&lt;/strong&gt; is Terraform’s way of surfacing results: data to feed into other modules, or simply whatever values we&apos;re interested in.
:::&lt;/p&gt;
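&lt;p&gt;For example, after an apply you can read outputs individually or as JSON (handy for feeding scripts); the output names below match the ones defined earlier:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;terraform output my-test
terraform output -json conditional-test-output
&lt;/code&gt;&lt;/pre&gt;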
&lt;h3&gt;Conditional Expression&lt;/h3&gt;
&lt;p&gt;We also introduce a &lt;code&gt;variable &quot;environment&quot;&lt;/code&gt; set to &lt;code&gt;&quot;dev&quot;&lt;/code&gt;, and use a &lt;strong&gt;conditional expression&lt;/strong&gt; in &lt;code&gt;output &quot;conditional-test-output&quot;&lt;/code&gt; to return a string based on its value—mimicking basic logic without needing an &lt;code&gt;if&lt;/code&gt; block.&lt;/p&gt;
&lt;p&gt;:::warning
In &lt;strong&gt;Terraform&lt;/strong&gt;, there’s no traditional &lt;code&gt;if-else&lt;/code&gt; block like in many programming languages, but &lt;strong&gt;conditional expressions&lt;/strong&gt; serve a similar purpose.
:::&lt;/p&gt;
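&lt;p&gt;A common pattern is using a conditional expression to pick a value per environment. Here is a minimal sketch (the &lt;code&gt;instance_size&lt;/code&gt; local is hypothetical, purely for illustration):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;locals {
  # &quot;if dev then small, else large&quot; without an if-else block
  instance_size = var.environment == &quot;dev&quot; ? &quot;t2.micro&quot; : &quot;t3.large&quot;
}
&lt;/code&gt;&lt;/pre&gt;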
&lt;h3&gt;Data Source Block&lt;/h3&gt;
&lt;p&gt;Finally, there&apos;s a &lt;strong&gt;data source&lt;/strong&gt;: &lt;code&gt;data &quot;local_file&quot;&lt;/code&gt;, which loads the content of a file named &lt;code&gt;test-vars.tf&lt;/code&gt; in the same module directory and outputs its content. This is a powerful feature when your Terraform config needs to reference external data—like existing files, templates, or config artifacts.&lt;/p&gt;
&lt;h3&gt;Terraform Commands&lt;/h3&gt;
&lt;p&gt;To manage infrastructure effectively with Terraform, there’s a standard lifecycle of commands that help you maintain control and visibility over changes.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;terraform init&lt;/code&gt;&lt;/strong&gt; initializes the working directory containing the &lt;code&gt;.tf&lt;/code&gt; files. It downloads the necessary provider plugins (like AWS) and prepares the backend if we&apos;re using one. This step is required before any other command.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;terraform validate&lt;/code&gt;&lt;/strong&gt; performs a syntax check on your configuration files to ensure everything is well-formed. It catches structural issues early but doesn&apos;t check the actual resource existence or cloud-level constraints.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;terraform plan&lt;/code&gt;&lt;/strong&gt; creates an execution plan showing what actions Terraform will take. You might ask, &quot;How do I read terraform plan output?&quot; It&apos;s simple! The plan compares the desired state (as defined in the code) with the current state and highlights what will be created, changed, or destroyed—without actually applying any changes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;terraform apply&lt;/code&gt;&lt;/strong&gt; executes the actions proposed by the plan, provisioning or modifying infrastructure to match your configuration. This is when Terraform interacts with AWS (or other providers) to make things real.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;terraform output&lt;/code&gt;&lt;/strong&gt; displays the values defined in your &lt;code&gt;output&lt;/code&gt; blocks after a successful apply. It’s commonly used to retrieve resource attributes (like instance IPs or ARNs) needed for further automation or verification.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
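&lt;p&gt;Put together, a typical run of this lifecycle looks like the following (saving the plan with &lt;code&gt;-out&lt;/code&gt; guarantees &lt;code&gt;apply&lt;/code&gt; executes exactly what you reviewed):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;terraform init
terraform validate
terraform plan -out=tfplan
terraform apply tfplan
terraform output
&lt;/code&gt;&lt;/pre&gt;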
&lt;h3&gt;Run &lt;code&gt;test-vars.tf&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;Let&apos;s use the &lt;code&gt;test-vars.tf&lt;/code&gt; file above to practice these commands:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./test-var-terraform-init.jpg&quot; alt=&quot;perform terraform init on test-var.tf&quot; title=&quot;perform terraform init on test-var.tf&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./test-var-terraform-validate.jpg&quot; alt=&quot;perform terraform validate on test-var.tf&quot; title=&quot;perform terraform validate on test-var.tf&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./test-var-terraform-plan.jpg&quot; alt=&quot;perform terraform plan on test-var.tf&quot; title=&quot;perform terraform plan on test-var.tf&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./test-var-terraform-output.jpg&quot; alt=&quot;perform terraform output on test-var.tf&quot; title=&quot;perform terraform output on test-var.tf&quot; /&gt;&lt;/p&gt;
&lt;h1&gt;Add AWS Capability into Terraform&lt;/h1&gt;
&lt;p&gt;One of the best parts about using Terraform with AWS is how easily you can spin up and tear down entire environments with a single command. Let’s get our local environment ready for cloud magic.&lt;/p&gt;
&lt;h2&gt;Setup AWS Root Account and Create IAM User&lt;/h2&gt;
&lt;p&gt;First, head over to &lt;a href=&quot;https://aws.amazon.com&quot;&gt;aws.amazon.com&lt;/a&gt; and create a root account if you haven’t already.&lt;/p&gt;
&lt;p&gt;Then, inside the AWS Console:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Navigate to &lt;strong&gt;IAM &amp;gt; Users&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Create a user named &lt;strong&gt;&lt;code&gt;terraform-admin&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Uncheck &quot;Users must create a new password at next sign-in&quot;; there&apos;s no need for it for testing purposes&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Grant it the &lt;strong&gt;&lt;code&gt;AdministratorAccess&lt;/code&gt;&lt;/strong&gt; (AWS Managed Policy)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Enable programmatic access (you&apos;ll need the access key + secret)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Console access is optional, I granted it anyway&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This IAM user will act as our Terraform operator.&lt;/p&gt;
&lt;p&gt;A few screenshots:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./IAM-user-creation-step1.jpg&quot; alt=&quot;terraform with aws to create AWS IAM user - grant console and set password&quot; title=&quot;terraform with aws to create AWS IAM user - grant console and set password&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./IAM-user-creation-step2.jpg&quot; alt=&quot;terraform with aws to create AWS IAM user - grant console and set password&quot; title=&quot;terraform with aws to create AWS IAM user - grant console and set password&quot; /&gt;&lt;/p&gt;
&lt;p&gt;After creating the user, we need to create an access key ID and secret access key for Terraform, since it needs to authenticate.&lt;/p&gt;
&lt;p&gt;Click the user in IAM, go to &quot;Security Credentials&quot; tab, scroll down to find &quot;Access Key&quot; section, create it as below:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./IAM-user-creation-step3.jpg&quot; alt=&quot;create AWS IAM user access id and secret key&quot; title=&quot;create AWS IAM user access id and secret key&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./IAM-user-creation-step4.jpg&quot; alt=&quot;download AWS IAM user access Id and secret key&quot; title=&quot;download AWS IAM user access Id and secret key&quot; /&gt;&lt;/p&gt;
&lt;p&gt;We will use this IAM user and its credentials later when our Terraform &lt;code&gt;.tf&lt;/code&gt; files interact with AWS.&lt;/p&gt;
&lt;h2&gt;Create IAM Users in Terraform With AWS&lt;/h2&gt;
&lt;p&gt;Finally, we&apos;re here to answer &quot;how to create an IAM user in AWS using Terraform&quot;... No worries.&lt;/p&gt;
&lt;p&gt;Honestly, Terraform doesn’t have much cryptic or hard-to-read syntax—it’s pretty clean. But there are two features I want to highlight: &lt;code&gt;count&lt;/code&gt; and &lt;code&gt;for_each&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;You might not need either of them on day one, but once you start managing resources with repeatable nested blocks—like multiple tags, multiple ingress rules, or custom configurations per item—they quickly become favorites.&lt;/p&gt;
&lt;p&gt;Here I want to demo each of them for this first Terraform with AWS post with a practical, AWS-focused script that provisions multiple IAM users.&lt;/p&gt;
&lt;p&gt;Let&apos;s focus on &lt;code&gt;count&lt;/code&gt; first.&lt;/p&gt;
&lt;h3&gt;AWS Authentication Setup in Terraform&lt;/h3&gt;
&lt;p&gt;Before the &lt;code&gt;.tf&lt;/code&gt; files can work with AWS, Terraform needs the IAM user&apos;s credentials so it can authenticate successfully. There are several ways to authenticate Terraform; here I am using environment variables.&lt;/p&gt;
&lt;p&gt;:::warning
Please replace &lt;code&gt;your_access_key_id&lt;/code&gt; and &lt;code&gt;your_secret_access_key&lt;/code&gt; below with the credentials of the IAM user created earlier.
:::&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;export AWS_ACCESS_KEY_ID=&apos;your_access_key_id&apos;
export AWS_SECRET_ACCESS_KEY=&apos;your_secret_access_key&apos;
export AWS_DEFAULT_REGION=&apos;us-west-1&apos;
export AWS_PROFILE=&quot;default&quot;
export AWS_CONFIG_FILE=&quot;$HOME/.aws/config&quot;
export AWS_SHARED_CREDENTIALS_FILE=&quot;$HOME/.aws/credentials&quot;
&lt;/code&gt;&lt;/pre&gt;
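&lt;p&gt;If you have the AWS CLI installed, a quick sanity check that the exported credentials actually work (before letting Terraform use them) is:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Should print the account ID and the terraform-admin user ARN
aws sts get-caller-identity
&lt;/code&gt;&lt;/pre&gt;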
&lt;p&gt;Let&apos;s create the folder:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;mkdir -p tutorial/aws-iam
cd tutorial/aws-iam
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;Count Usage&lt;/h3&gt;
&lt;p&gt;Without &lt;code&gt;count&lt;/code&gt;, the initial version of &lt;code&gt;aws_iam.tf&lt;/code&gt; looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;terraform {
  required_providers {
    aws = {
      source  = &quot;hashicorp/aws&quot;
      version = &quot;~&amp;gt; 5.0&quot;
    }
  }
}

resource &quot;aws_iam_user&quot; &quot;terraform_user_0&quot; {
  name = &quot;terraform-user-0&quot;
}

resource &quot;aws_iam_user&quot; &quot;terraform_user_1&quot; {
  name = &quot;terraform-user-1&quot;
}

resource &quot;aws_iam_user&quot; &quot;terraform_user_2&quot; {
  name = &quot;terraform-user-2&quot;
}

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;:::warning
IAM users are global, so no need to specify &lt;code&gt;region&lt;/code&gt; for them.
:::&lt;/p&gt;
&lt;p&gt;Hmm, how can we use &lt;code&gt;count&lt;/code&gt; in Terraform to simplify the code above?&lt;/p&gt;
&lt;p&gt;You got this! Here we improve scalability with &lt;code&gt;count&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;terraform {
  required_providers {
    aws = {
      source  = &quot;hashicorp/aws&quot;
      version = &quot;~&amp;gt; 5.0&quot;
    }
  }
}

variable &quot;user_prefix&quot; {
  type    = string
  default = &quot;terraform-user&quot;
}

variable &quot;user_count&quot; {
  type    = number
  default = 3
}

resource &quot;aws_iam_user&quot; &quot;terraform_user&quot; {
  count = var.user_count

  name = &quot;${var.user_prefix}-${count.index}&quot;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So now the benefit is obvious: no hard-coded user names, and it&apos;s easier to scale!&lt;/p&gt;
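&lt;p&gt;To see all the generated names, you could add an &lt;code&gt;output&lt;/code&gt; using a splat expression over the counted resource; a small sketch:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;output &quot;user_names&quot; {
  # One entry per count instance: terraform-user-0, -1, -2
  value = aws_iam_user.terraform_user[*].name
}
&lt;/code&gt;&lt;/pre&gt;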
&lt;h3&gt;for_each Usage&lt;/h3&gt;
&lt;p&gt;Now, how do we use &lt;code&gt;for_each&lt;/code&gt; in Terraform? Let&apos;s take a look.&lt;/p&gt;
&lt;p&gt;While both approaches (&lt;code&gt;for_each&lt;/code&gt; and &lt;code&gt;count&lt;/code&gt;) are valid in this Terraform with AWS code, they serve different purposes depending on the use case.&lt;/p&gt;
&lt;p&gt;Here’s how to create multiple IAM users using &lt;code&gt;for_each&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;terraform {
  required_providers {
    aws = {
      source  = &quot;hashicorp/aws&quot;
      version = &quot;~&amp;gt; 5.0&quot;
    }
  }
}

variable &quot;user_prefix&quot; {
  type    = string
  default = &quot;terraform-user&quot;
}

variable &quot;user_count&quot; {
  type    = number
  default = 3
}

locals {
  user_names = [for i in range(var.user_count) : &quot;${var.user_prefix}-${i}&quot;]
}

resource &quot;aws_iam_user&quot; &quot;terraform_user&quot; {
  for_each = toset(local.user_names)

  name = each.value
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This approach uses a local variable to generate a set of user names and then iterates over each unique name using &lt;code&gt;for_each&lt;/code&gt;. Each item in the set becomes a resource instance with its own lifecycle, based on the value of &lt;code&gt;each.value&lt;/code&gt;.&lt;/p&gt;
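&lt;p&gt;Since &lt;code&gt;for_each&lt;/code&gt; instances are keyed by value rather than by index, referencing them looks slightly different from &lt;code&gt;count&lt;/code&gt;. A sketch of collecting each user&apos;s ARN into a map:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;output &quot;user_arns&quot; {
  # Keys are the user names from the set, values are the ARNs
  value = { for name, user in aws_iam_user.terraform_user : name =&gt; user.arn }
}
&lt;/code&gt;&lt;/pre&gt;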
&lt;h3&gt;Count vs. for_each: When to Use Which&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;&lt;code&gt;count&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;&lt;code&gt;for_each&lt;/code&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Input type&lt;/td&gt;
&lt;td&gt;Integer&lt;/td&gt;
&lt;td&gt;Set, map, or other collection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Index reference&lt;/td&gt;
&lt;td&gt;&lt;code&gt;count.index&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;each.key&lt;/code&gt; / &lt;code&gt;each.value&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Resource tracking&lt;/td&gt;
&lt;td&gt;Index-based&lt;/td&gt;
&lt;td&gt;Value/key-based&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reordering impact&lt;/td&gt;
&lt;td&gt;Can recreate resources on list changes&lt;/td&gt;
&lt;td&gt;More stable; avoids recreation if values remain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best suited for&lt;/td&gt;
&lt;td&gt;Identical resources with predictable count&lt;/td&gt;
&lt;td&gt;Resources that need to be uniquely identified&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Use &lt;code&gt;count&lt;/code&gt; when you need a fixed number of uniform resources and the specific identity of each resource doesn’t matter. Use &lt;code&gt;for_each&lt;/code&gt; when you&apos;re dealing with uniquely named resources or working with sets/maps — especially in scenarios where identity and lifecycle tracking are important.&lt;/p&gt;
&lt;p&gt;Both approaches are fully supported, and the choice should be guided by the structure of your data and the operational needs of your infrastructure.&lt;/p&gt;
&lt;h1&gt;Advanced: Create AWS VPC/Network/EC2 With Security Groups&lt;/h1&gt;
&lt;p&gt;Whether you&apos;re building a simple EC2 instance or managing complex networking, Terraform with AWS keeps everything declarative and under control.&lt;/p&gt;
&lt;p&gt;Now, let’s build a practical example that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Create a VPC and subnet in AWS&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Set up internet access&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Add a &lt;strong&gt;security group&lt;/strong&gt; that allows &lt;strong&gt;SSH and HTTP&lt;/strong&gt; inbound, and &lt;strong&gt;all traffic outbound&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Create an EC2 instance and attach security group&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Directory Structure&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;tutorial/aws-vpc-ec2-demo/
├── main.tf
├── network.tf
├── internet_gateway.tf
├── route_table.tf
├── security_group.tf
├── ec2.tf
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Create and enter the folder:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;mkdir -p tutorial/aws-vpc-ec2-demo
cd tutorial/aws-vpc-ec2-demo

&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;main.tf&lt;/h2&gt;
&lt;p&gt;This section is always required when working on Terraform with AWS.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;terraform {
  required_providers {
    aws = {
      source  = &quot;hashicorp/aws&quot;
      version = &quot;~&amp;gt; 5.0&quot;
    }
  }
}

provider &quot;aws&quot; {
  region = &quot;us-west-1&quot;
}

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The block of code in above is the essential handshake between Terraform and AWS. The &lt;code&gt;terraform&lt;/code&gt; block specifies that your configuration requires the &lt;strong&gt;AWS provider&lt;/strong&gt;, sourced from &lt;strong&gt;HashiCorp’s registry&lt;/strong&gt;, and locked to version &lt;code&gt;~&amp;gt; 5.0&lt;/code&gt;, which means any non-breaking updates in the 5.x series are acceptable. This ensures compatibility and stability across Terraform runs.&lt;/p&gt;
&lt;p&gt;About the version match, I pulled this table for your quick references:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version Constraint&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;th&gt;Example Allowed Versions&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;~&amp;gt; 3.5&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Allow patch-level updates within 3.x (&amp;gt;=3.5.0, &amp;lt;4.0.0)&lt;/td&gt;
&lt;td&gt;3.5.0, 3.5.1, 3.6.0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;~&amp;gt; 3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Allow any version within major version 3 (&amp;gt;=3.0.0, &amp;lt;4.0.0)&lt;/td&gt;
&lt;td&gt;3.0.0, 3.5.2, 3.99.99&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;&amp;gt;= 3.5, &amp;lt; 3.8&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Allow only versions in a specific minor range&lt;/td&gt;
&lt;td&gt;3.5.0, 3.6.1, 3.7.9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;= 3.5.2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Pin to a specific version only&lt;/td&gt;
&lt;td&gt;Only 3.5.2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;&amp;gt; 3.5&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Allow any version greater than 3.5 (but not 3.5)&lt;/td&gt;
&lt;td&gt;3.6.0, 4.0.0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;= 3.5&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Allow versions less than or equal to 3.5&lt;/td&gt;
&lt;td&gt;3.0.0, 3.4.9, 3.5.0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The &lt;code&gt;provider &quot;aws&quot;&lt;/code&gt; block then sets the context for how Terraform interacts with your AWS environment — in this case, targeting the &lt;code&gt;us-west-1&lt;/code&gt; region. This tells Terraform, “Hey, deploy all the resources in the California region.” By declaring the provider and version this way, we&apos;re building a reproducible, consistent infrastructure-as-code setup that won’t break unexpectedly if a newer major version of the provider is released.&lt;/p&gt;
&lt;p&gt;:::info
Since we&apos;ve exported the credentials as environment variables, there&apos;s no need to specify credentials in the &lt;code&gt;provider &quot;aws&quot;&lt;/code&gt; block.
:::&lt;/p&gt;
&lt;h2&gt;network.tf&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;resource &quot;aws_vpc&quot; &quot;main&quot; {
  cidr_block = &quot;10.0.0.0/16&quot;
}

resource &quot;aws_subnet&quot; &quot;main&quot; {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = &quot;10.0.1.0/24&quot;
  map_public_ip_on_launch = true
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In AWS, every resource lives inside a Virtual Private Cloud (VPC). This Terraform with AWS block creates a custom VPC with a &lt;code&gt;/16&lt;/code&gt; CIDR block, which allows for 65,536 private IP addresses — a large range that gives you plenty of room to grow. By the way, AWS automatically creates a default VPC in every region.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;aws_subnet&lt;/code&gt; slices a &lt;code&gt;/24&lt;/code&gt; block from the VPC — allowing 256 IPs (minus AWS reservations). The critical flag here is &lt;code&gt;map_public_ip_on_launch = true&lt;/code&gt;. Without this, your EC2 instances won’t get a public IP, and you&apos;ll be stuck trying to SSH into a black hole. With this setting enabled, instances launched into this subnet will be publicly addressable.&lt;/p&gt;
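&lt;p&gt;You can sanity-check this CIDR math in &lt;code&gt;terraform console&lt;/code&gt;: the built-in &lt;code&gt;cidrsubnet&lt;/code&gt; function carves our &lt;code&gt;/24&lt;/code&gt; out of the VPC&apos;s &lt;code&gt;/16&lt;/code&gt; by adding 8 bits to the prefix and taking subnet number 1:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&gt; cidrsubnet(&quot;10.0.0.0/16&quot;, 8, 1)
&quot;10.0.1.0/24&quot;
&lt;/code&gt;&lt;/pre&gt;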
&lt;h2&gt;internet_gateway.tf&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;resource &quot;aws_internet_gateway&quot; &quot;igw&quot; {
  vpc_id = aws_vpc.main.id
}
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;route_table.tf&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;resource &quot;aws_route_table&quot; &quot;public&quot; {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = &quot;0.0.0.0/0&quot;
    gateway_id = aws_internet_gateway.igw.id
  }

  tags = {
    Name = &quot;PublicRouteTable&quot;
  }
}

resource &quot;aws_route_table_association&quot; &quot;public_subnet&quot; {
  subnet_id      = aws_subnet.main.id
  route_table_id = aws_route_table.public.id
}

&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;security_group.tf&lt;/h2&gt;
&lt;p&gt;Security groups in AWS are like bouncers for your EC2 instances — they control what traffic is allowed in or out of your virtual machines. Whether you&apos;re allowing SSH for remote access or HTTP for your website, security groups are your first line of defense.&lt;/p&gt;
&lt;p&gt;When working on Terraform with AWS, you &lt;em&gt;can&lt;/em&gt; define security rules inline within the &lt;code&gt;aws_security_group&lt;/code&gt; resource. However, HashiCorp now recommends using &lt;a href=&quot;https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/vpc_security_group_ingress_rule&quot;&gt;&lt;strong&gt;dedicated resources&lt;/strong&gt; for ingress/egress rules&lt;/a&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;aws_vpc_security_group_ingress_rule&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;aws_vpc_security_group_egress_rule&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Previously, HashiCorp provided &lt;code&gt;ingress&lt;/code&gt; and &lt;code&gt;egress&lt;/code&gt; arguments on the &lt;a href=&quot;https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/security_group&quot;&gt;&lt;code&gt;aws_security_group&lt;/code&gt;&lt;/a&gt; resource for configuring inline rules. But these struggle with managing multiple CIDR blocks, tags, and descriptions due to the historical lack of unique rule IDs. Using the &lt;a href=&quot;https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/vpc_security_group_egress_rule&quot;&gt;&lt;code&gt;aws_vpc_security_group_egress_rule&lt;/code&gt;&lt;/a&gt; and &lt;code&gt;aws_vpc_security_group_ingress_rule&lt;/code&gt; resources is now the best practice.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;resource &quot;aws_security_group&quot; &quot;web_sg&quot; {
  name        = &quot;web_sg&quot;
  description = &quot;Allow HTTP and SSH&quot;
  vpc_id      = aws_vpc.main.id
}

resource &quot;aws_vpc_security_group_ingress_rule&quot; &quot;http_in&quot; {
  security_group_id = aws_security_group.web_sg.id
  cidr_ipv4         = &quot;0.0.0.0/0&quot;
  from_port         = 80
  to_port           = 80
  ip_protocol       = &quot;tcp&quot;
}

resource &quot;aws_vpc_security_group_ingress_rule&quot; &quot;ssh_in&quot; {
  security_group_id = aws_security_group.web_sg.id
  cidr_ipv4         = &quot;0.0.0.0/0&quot;
  from_port         = 22
  to_port           = 22
  ip_protocol       = &quot;tcp&quot;
}

resource &quot;aws_vpc_security_group_egress_rule&quot; &quot;all_out&quot; {
  security_group_id = aws_security_group.web_sg.id
  cidr_ipv4         = &quot;0.0.0.0/0&quot;
  ip_protocol       = &quot;-1&quot; # all protocols; ports are omitted with -1
}

&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;ec2.tf&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;data &quot;aws_ami&quot; &quot;amazon_linux_2&quot; {
  most_recent = true
  owners      = [&quot;amazon&quot;]

  filter {
    name   = &quot;name&quot;
    values = [&quot;amzn2-ami-hvm-*-x86_64-gp2&quot;]
  }
}

resource &quot;aws_instance&quot; &quot;web&quot; {
  ami                         = data.aws_ami.amazon_linux_2.id
  instance_type               = &quot;t2.micro&quot;
  subnet_id                   = aws_subnet.main.id
  vpc_security_group_ids      = [aws_security_group.web_sg.id]
  associate_public_ip_address = true

  tags = {
    Name = &quot;terraform&quot;
  }

  user_data = &amp;lt;&amp;lt;-EOF
              #!/bin/bash
              sudo amazon-linux-extras enable nginx1
              sudo yum clean metadata
              sudo yum install -y nginx
              sudo systemctl start nginx
              EOF
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here we&apos;re launching a micro-sized instance with Terraform on AWS, placing it in our subnet, and attaching the security group that allows HTTP traffic.&lt;/p&gt;
&lt;p&gt;It uses the latest Amazon Linux 2 AMI without hardcoding image IDs, because the &lt;code&gt;aws_ami&lt;/code&gt; data source filters for the AMIs we need. There may be multiple matching AMIs; &lt;code&gt;most_recent&lt;/code&gt; ensures Terraform picks the latest one. Specifying &lt;code&gt;t2.micro&lt;/code&gt; matters because it&apos;s covered by the AWS free tier, and I don&apos;t want the AWS bill to surprise me...&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;user_data&lt;/code&gt; block runs a bash script to install and start NGINX right after the instance boots — voilà, instant web server! 🎉&lt;/p&gt;
&lt;h2&gt;Terraform Graph&lt;/h2&gt;
&lt;p&gt;When working with even a moderately sized Terraform with AWS project—like our &lt;code&gt;aws-vpc-ec2-demo&lt;/code&gt; that stitches together VPCs, subnets, security groups, EC2 instances, internet gateways, and more—keeping track of how all the resources relate to each other can get a bit overwhelming. That’s where the magic of &lt;code&gt;terraform graph&lt;/code&gt; comes in.&lt;/p&gt;
&lt;p&gt;Terraform automatically analyzes all your &lt;code&gt;.tf&lt;/code&gt; files and maps out the dependencies between resources, so it knows exactly what needs to be created first, what depends on what, and how to destroy them safely in reverse order. It builds a dependency graph internally—and you can view this visually by piping the output of &lt;code&gt;terraform graph&lt;/code&gt; into a tool like Graphviz.  It&apos;s an eye-opener for understanding Terraform’s internal logic and a fantastic way to document and debug your setup.&lt;/p&gt;
&lt;p&gt;Just run:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;terraform graph
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You will get:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ terraform graph
digraph G {
  rankdir = &quot;RL&quot;;
  node [shape = rect, fontname = &quot;sans-serif&quot;];
  &quot;data.aws_ami.amazon_linux_2&quot; [label=&quot;data.aws_ami.amazon_linux_2&quot;];
  &quot;aws_instance.web&quot; [label=&quot;aws_instance.web&quot;];
  &quot;aws_internet_gateway.igw&quot; [label=&quot;aws_internet_gateway.igw&quot;];
  &quot;aws_route_table.public&quot; [label=&quot;aws_route_table.public&quot;];
  &quot;aws_route_table_association.public_subnet&quot; [label=&quot;aws_route_table_association.public_subnet&quot;];
  &quot;aws_security_group.web_sg&quot; [label=&quot;aws_security_group.web_sg&quot;];
  &quot;aws_subnet.main&quot; [label=&quot;aws_subnet.main&quot;];
  &quot;aws_vpc.main&quot; [label=&quot;aws_vpc.main&quot;];
  &quot;aws_vpc_security_group_egress_rule.all_out&quot; [label=&quot;aws_vpc_security_group_egress_rule.all_out&quot;];
  &quot;aws_vpc_security_group_ingress_rule.http_in&quot; [label=&quot;aws_vpc_security_group_ingress_rule.http_in&quot;];
  &quot;aws_vpc_security_group_ingress_rule.ssh_in&quot; [label=&quot;aws_vpc_security_group_ingress_rule.ssh_in&quot;];
  &quot;aws_instance.web&quot; -&amp;gt; &quot;data.aws_ami.amazon_linux_2&quot;;
  &quot;aws_instance.web&quot; -&amp;gt; &quot;aws_security_group.web_sg&quot;;
  &quot;aws_instance.web&quot; -&amp;gt; &quot;aws_subnet.main&quot;;
  &quot;aws_internet_gateway.igw&quot; -&amp;gt; &quot;aws_vpc.main&quot;;
  &quot;aws_route_table.public&quot; -&amp;gt; &quot;aws_internet_gateway.igw&quot;;
  &quot;aws_route_table_association.public_subnet&quot; -&amp;gt; &quot;aws_route_table.public&quot;;
  &quot;aws_route_table_association.public_subnet&quot; -&amp;gt; &quot;aws_subnet.main&quot;;
  &quot;aws_security_group.web_sg&quot; -&amp;gt; &quot;aws_vpc.main&quot;;
  &quot;aws_subnet.main&quot; -&amp;gt; &quot;aws_vpc.main&quot;;
  &quot;aws_vpc_security_group_egress_rule.all_out&quot; -&amp;gt; &quot;aws_security_group.web_sg&quot;;
  &quot;aws_vpc_security_group_ingress_rule.http_in&quot; -&amp;gt; &quot;aws_security_group.web_sg&quot;;
  &quot;aws_vpc_security_group_ingress_rule.ssh_in&quot; -&amp;gt; &quot;aws_security_group.web_sg&quot;;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Paste the output above into https://dreampuf.github.io/GraphvizOnline/ and you&apos;ll get the graph below:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./aws-vpc-ec2-demo-dependency-graph.png&quot; alt=&quot;The dependency graph of aws-vpc-ec2-demo&quot; /&gt;&lt;/p&gt;
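&lt;p&gt;If you prefer to render the graph locally instead of using the online tool, you can pipe the output straight into the Graphviz &lt;code&gt;dot&lt;/code&gt; command (assuming Graphviz is installed):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;terraform graph | dot -Tpng -o graph.png
&lt;/code&gt;&lt;/pre&gt;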
&lt;p&gt;Helpful, right? Once we understand how Terraform with AWS handles dependencies and state, our infrastructure starts to feel like elegant code, not chaos.&lt;/p&gt;
&lt;h2&gt;Let&apos;s give it a try!&lt;/h2&gt;
&lt;p&gt;&lt;img src=&quot;./aws-vpc-ec2-demo-terraform-apply-1.jpg&quot; alt=&quot;terraform with aws vpc and ec2 demo&apos;s output of terraform apply&quot; title=&quot;terraform with aws vpc and ec2 demo&apos;s output of terraform apply&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Let&apos;s get the public IP and test Nginx:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./aws-vpc-ec2-demo-test-nginx-1.jpg&quot; alt=&quot;aws-vpc-ec2-demo test nginx&quot; title=&quot;aws-vpc-ec2-demo test nginx&quot; /&gt;&lt;/p&gt;
&lt;h1&gt;Hooray!&lt;/h1&gt;
&lt;p&gt;I spent nearly a week putting together this first post in the &quot;Terraform with AWS&quot; guide. Not because it&apos;s that complicated, but because I wanted every command, every config, and every explanation to &lt;em&gt;click&lt;/em&gt; for anyone following along. From VPC, IAM users, and Security Groups to EC2 and best practices, I&apos;ve covered the real things you&apos;d face when building infrastructure from scratch using Terraform with AWS. 💻☁️&lt;/p&gt;
&lt;p&gt;This blog series is all about mastering &lt;strong&gt;Terraform with AWS&lt;/strong&gt; from the ground up — no shortcuts, just clean, scalable infrastructure-as-code.&lt;/p&gt;
&lt;p&gt;Up next? We&apos;re going beyond the basics — deploying a fully working EKS (Elastic Kubernetes Service) cluster with Terraform. If you thought this post was useful, wait until you see what’s coming. Buckle up, cloud wranglers. 🚀&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Stay tuned for the next post!&lt;/strong&gt; 😎🔥&lt;/p&gt;
&lt;p&gt;:::info
I’ve pushed everything to GitHub for you! You can find all the Terraform scripts from this blog post right here: 👉 &lt;a href=&quot;https://github.com/geekcoding101/iac/tree/main/terraform/tutorial&quot;&gt;https://github.com/geekcoding101/iac/tree/main/terraform/tutorial&lt;/a&gt;  🚀
:::&lt;/p&gt;
&lt;p&gt;:::info
You&apos;re on a roll! Don&apos;t stop now—check out the full series of Kubernetes and level up your Kubernetes skills. Each post builds on the last, so make sure you haven’t missed anything! 👇&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/kubernetes-tutorial-part1&quot;&gt;Part 1&lt;/a&gt;&lt;/strong&gt;, I laid out the &lt;strong&gt;networking plan&lt;/strong&gt;, my &lt;strong&gt;goals for setting up Kubernetes&lt;/strong&gt;, and how to &lt;strong&gt;prepare a base VM image&lt;/strong&gt; for the cluster.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/tutorial-part2-dns-server-ntp&quot;&gt;Part 2&lt;/a&gt;&lt;/strong&gt;, I walked through &lt;strong&gt;configuring a local DNS server and NTP server&lt;/strong&gt;, essential for stable name resolution and time synchronization across nodes. These foundational steps will make our Kubernetes setup smoother.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/part3-kubernetes-cluster-setup&quot;&gt;Part 3&lt;/a&gt;&lt;/strong&gt;, I finished the Kubernetes cluster setup with &lt;strong&gt;Flannel&lt;/strong&gt;, ending up with one Kubernetes master and 4 worker nodes ready for real workloads.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/part3-kubernetes-cluster-setup&quot;&gt;Part 4&lt;/a&gt;&lt;/strong&gt;, I explored &lt;strong&gt;NodePort&lt;/strong&gt; and &lt;strong&gt;ClusterIP&lt;/strong&gt;, and understood the key differences, use cases, and when to choose each for internal and external service access! 🔥&lt;/p&gt;
&lt;p&gt;🚀 In &lt;a href=&quot;/posts/externalname-loadbalancer-5&quot;&gt;&lt;strong&gt;Part 5&lt;/strong&gt;&lt;/a&gt;, I dived into &lt;strong&gt;&lt;code&gt;ExternalName&lt;/code&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;code&gt;LoadBalancer&lt;/code&gt;&lt;/strong&gt; services, uncovering how they handle external access, DNS resolution, and dynamic traffic distribution!
:::&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Terraform Meta Arguments Unlocked: Practical Patterns for Clean Infrastructure Code</title><link>https://geekcoding101.com/posts/terraform-meta-arguments</link><guid isPermaLink="true">https://geekcoding101.com/posts/terraform-meta-arguments</guid><pubDate>Mon, 21 Apr 2025 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;I’ve always found &lt;a href=&quot;https://www.terraform.io/&quot;&gt;Terraform&lt;/a&gt; &lt;em&gt;meta arguments&lt;/em&gt; a bit confusing at first glance—not &lt;code&gt;count&lt;/code&gt;, &lt;code&gt;for_each&lt;/code&gt;, but things like &lt;code&gt;connection&lt;/code&gt;, &lt;code&gt;provisioner&lt;/code&gt;, &lt;code&gt;depends_on&lt;/code&gt;, &lt;code&gt;source&lt;/code&gt; and &lt;code&gt;lifecycle&lt;/code&gt; often seem straightforward but can behave unexpectedly in different contexts. That’s why I decided to write this blog post: to break them down clearly, explain what each one does, and show practical examples of how and when to use them effectively.&lt;/p&gt;
&lt;h1&gt;Terraform Meta Arguments Table&lt;/h1&gt;
&lt;p&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Meta-Argument&lt;/th&gt;
&lt;th&gt;Applicable To&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;count&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;resource&lt;/code&gt;, &lt;code&gt;module&lt;/code&gt;, &lt;code&gt;data&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Create multiple instances of a resource or module using a number.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;for_each&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;resource&lt;/code&gt;, &lt;code&gt;module&lt;/code&gt;, &lt;code&gt;data&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Create multiple instances using a map or set of strings. More flexible than count.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;provider&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;resource&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Specify which provider configuration to use if multiple are defined.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;depends_on&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;resource&lt;/code&gt;, &lt;code&gt;module&lt;/code&gt;, &lt;code&gt;data&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Explicitly define dependencies between resources or modules.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;lifecycle&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;resource&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Control resource creation and destruction behavior (e.g., prevent_destroy, ignore_changes).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;provisioner&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;resource&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Run scripts or commands after a resource is created or destroyed.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;connection&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;resource&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Define how to connect to a remote resource (used with provisioners).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;source&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;module&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Specify the location of a module (registry, Git, local path, etc.).&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h1&gt;Usage&lt;/h1&gt;
&lt;h2&gt;🧮 Terraform meta arguments: count&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;resource &quot;aws_instance&quot; &quot;web&quot; { 
  count = 3 
  ami = &quot;ami-0c55b159cbfafe1f0&quot; 
  instance_type = &quot;t2.micro&quot; 
  tags = { 
    Name = &quot;Web-${count.index}&quot; 
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;✅ Creates multiple instances using a simple integer.&lt;/p&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Attention Notes:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;count.index&lt;/code&gt; starts from &lt;code&gt;0&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Not ideal for working with named collections (use &lt;code&gt;for_each&lt;/code&gt; instead).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Not supported in &lt;code&gt;provider&lt;/code&gt; blocks.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
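&lt;p&gt;Instances created with &lt;code&gt;count&lt;/code&gt; are addressed by index. A minimal sketch, building on the example above, of referencing one instance or all of them with a splat expression:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Address a single instance by index
output &quot;first_web_ip&quot; {
  value = aws_instance.web[0].public_ip
}

# Splat expression: collect an attribute from every instance
output &quot;all_web_ips&quot; {
  value = aws_instance.web[*].public_ip
}
&lt;/code&gt;&lt;/pre&gt;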
&lt;h2&gt;🔁 Terraform meta arguments: for_each&lt;/h2&gt;
&lt;p&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;resource &quot;aws_s3_bucket&quot; &quot;example&quot; { 
  for_each = toset([&quot;logs&quot;, &quot;media&quot;, &quot;backups&quot;]) 
  bucket = &quot;my-bucket-${each.key}&quot; 
  acl = &quot;private&quot; 
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;✅ More flexible than &lt;code&gt;count&lt;/code&gt;, supports map and set types.&lt;/p&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Attention Notes:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;each.key&lt;/code&gt; and &lt;code&gt;each.value&lt;/code&gt; are used depending on the collection type (for a set of strings, they are identical).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Keys must be unique.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Best for managing multiple named resources (e.g., per environment).&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
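&lt;p&gt;With a map, &lt;code&gt;each.key&lt;/code&gt; and &lt;code&gt;each.value&lt;/code&gt; differ, which is handy for per-environment settings. A sketch, with illustrative instance sizes:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;resource &quot;aws_instance&quot; &quot;app&quot; {
  for_each      = { dev = &quot;t2.micro&quot;, prod = &quot;t3.small&quot; }
  ami           = &quot;ami-0c55b159cbfafe1f0&quot;
  instance_type = each.value   # the map value

  tags = {
    Name = &quot;app-${each.key}&quot;   # the map key
  }
}
&lt;/code&gt;&lt;/pre&gt;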
&lt;h2&gt;🧩 Terraform meta arguments: provider&lt;/h2&gt;
&lt;p&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;provider &quot;aws&quot; { 
  region = &quot;us-east-1&quot; 
  alias = &quot;east&quot; 
} 

provider &quot;aws&quot; { 
  region = &quot;us-west-2&quot; 
  alias = &quot;west&quot; 
} 

resource &quot;aws_instance&quot; &quot;example&quot; { 
  provider = aws.west 
  ami = &quot;ami-0c55b159cbfafe1f0&quot; 
  instance_type = &quot;t2.micro&quot; 
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;✅ Specifies a particular provider config when multiple are defined.&lt;/p&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Attention Notes:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Only works in &lt;code&gt;resource&lt;/code&gt;, not &lt;code&gt;module&lt;/code&gt; blocks.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Must use &lt;code&gt;alias&lt;/code&gt; to differentiate multiple configurations of the same provider.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
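&lt;p&gt;For modules, the equivalent is the &lt;code&gt;providers&lt;/code&gt; map, which passes an aliased configuration down into the module. A minimal sketch, assuming the &lt;code&gt;aws.west&lt;/code&gt; alias defined above and a hypothetical local module:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;module &quot;west_app&quot; {
  source = &quot;./modules/app&quot;   # hypothetical module path

  providers = {
    aws = aws.west   # the module&apos;s &quot;aws&quot; provider resolves to the west alias
  }
}
&lt;/code&gt;&lt;/pre&gt;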
&lt;h2&gt;🔗 Terraform meta arguments: depends_on&lt;/h2&gt;
&lt;p&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;resource &quot;aws_iam_role&quot; &quot;role&quot; { 
  name = &quot;example-role&quot; 
  assume_role_policy = jsonencode({ 
    Version = &quot;2012-10-17&quot;, 
    Statement = [{ 
      Effect = &quot;Allow&quot;, 
      Principal = { Service = &quot;ec2.amazonaws.com&quot; }, 
      Action = &quot;sts:AssumeRole&quot; 
    }] 
  }) 
} 

resource &quot;aws_iam_role_policy_attachment&quot; &quot;attachment&quot; { 
  role = aws_iam_role.role.name 
  policy_arn = &quot;arn:aws:iam::aws:policy/ReadOnlyAccess&quot; 
  depends_on = [aws_iam_role.role] 
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;✅ Ensures ordering when Terraform can&apos;t automatically infer it.&lt;/p&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Attention Notes:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Use when a dependency is implicit (e.g., &lt;code&gt;local-exec&lt;/code&gt;, provisioners).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Can be used in &lt;code&gt;resource&lt;/code&gt;, &lt;code&gt;module&lt;/code&gt;, and &lt;code&gt;data&lt;/code&gt; blocks.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
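&lt;p&gt;&lt;code&gt;depends_on&lt;/code&gt; works on &lt;code&gt;module&lt;/code&gt; blocks too (Terraform 0.13 and later). A sketch, with a hypothetical module path:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;module &quot;app&quot; {
  source     = &quot;./modules/app&quot;       # hypothetical module path
  depends_on = [aws_iam_role.role]   # nothing in the module is created before the role
}
&lt;/code&gt;&lt;/pre&gt;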
&lt;h2&gt;♻️ Terraform meta arguments: lifecycle&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;resource &quot;aws_instance&quot; &quot;db&quot; { 
  ami = &quot;ami-0c55b159cbfafe1f0&quot; 
  instance_type = &quot;t3.micro&quot; 

  lifecycle { 
    prevent_destroy = true 
    create_before_destroy = true 
    ignore_changes = [tags[&quot;Owner&quot;]] 
  } 
}
&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Lifecycle Argument&lt;/th&gt;
&lt;th&gt;Default&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;create_before_destroy&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Ensures a new resource is created before the old one is destroyed to avoid downtime. Common in zero-downtime deployments.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;prevent_destroy&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;false&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Prevents a resource from being destroyed. Terraform will produce an error if a destroy is attempted on this resource.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ignore_changes&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;[]&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Ignores changes to specific attributes in future plans. Useful for fields updated externally or during auto-scaling.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;replace_triggered_by&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;[]&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Forces resource replacement when another referenced resource or attribute changes. Introduced in Terraform 1.2.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;✅ Fine-tunes how Terraform handles changes and destruction.&lt;/p&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Attention Notes:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;prevent_destroy&lt;/code&gt; helps protect critical infra.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;ignore_changes&lt;/code&gt; avoids re-creating resources when certain fields change.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Only supported in &lt;code&gt;resource&lt;/code&gt; blocks.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
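&lt;p&gt;&lt;code&gt;replace_triggered_by&lt;/code&gt; deserves its own sketch: it forces replacement of a resource when something it references changes. A hypothetical example (the security group name here is illustrative):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;resource &quot;aws_instance&quot; &quot;app&quot; {
  ami           = &quot;ami-0c55b159cbfafe1f0&quot;
  instance_type = &quot;t2.micro&quot;

  lifecycle {
    # Replace this instance whenever the referenced security group is replaced
    replace_triggered_by = [aws_security_group.web_sg.id]
  }
}
&lt;/code&gt;&lt;/pre&gt;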
&lt;h2&gt;💻 Terraform meta arguments: provisioner&lt;/h2&gt;
&lt;p&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;resource &quot;null_resource&quot; &quot;example&quot; { 
  provisioner &quot;local-exec&quot; { 
    command = &quot;echo Hello, Terraform!&quot; 
  } 
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;✅ Executes a script or command after resource creation.&lt;/p&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Attention Notes:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Best for ad-hoc automation or external configuration steps.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Two types: &lt;code&gt;local-exec&lt;/code&gt; and &lt;code&gt;remote-exec&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Not idempotent—Terraform can&apos;t track what was done.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;🔐 Terraform meta arguments: connection&lt;/h2&gt;
&lt;p&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;resource &quot;null_resource&quot; &quot;remote&quot; { 
  provisioner &quot;remote-exec&quot; { 
    inline = [&quot;echo Connected!&quot;] 
  } 

  connection { 
    type = &quot;ssh&quot; 
    user = &quot;ubuntu&quot; 
    host = &quot;1.2.3.4&quot; 
    private_key = file(&quot;~/.ssh/id_rsa&quot;) 
  } 
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;✅ Used with &lt;code&gt;remote-exec&lt;/code&gt; provisioners to connect to VMs or servers.&lt;/p&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Attention Notes:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Only works inside a &lt;code&gt;resource&lt;/code&gt; block.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Requires credentials and reachable IP.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Supports SSH and &lt;a href=&quot;https://learn.microsoft.com/en-us/windows/win32/winrm/portal&quot;&gt;WinRM&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Mostly used in VM provisioning, not cloud-native workflows.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;📦 Terraform meta arguments: source (for modules)&lt;/h2&gt;
&lt;p&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;module &quot;vpc&quot; { 
  source = &quot;terraform-aws-modules/vpc/aws&quot; 
  version = &quot;~&amp;gt; 4.0&quot; 
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;✅ Tells Terraform where to find the module (registry, Git, local, etc.).&lt;/p&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Attention Notes:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Only valid in &lt;code&gt;module&lt;/code&gt; blocks.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;the &lt;code&gt;source&lt;/code&gt; argument in a Terraform &lt;code&gt;module&lt;/code&gt; block &lt;strong&gt;does not support dynamic expressions like variables&lt;/strong&gt;. It must be a &lt;strong&gt;static, known-at-plan-time string&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;When using the registry, you can also set a &lt;code&gt;version&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
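&lt;p&gt;Besides the registry, &lt;code&gt;source&lt;/code&gt; accepts local paths and Git URLs, among others. A sketch with hypothetical paths and repository URL:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;module &quot;local_example&quot; {
  source = &quot;./modules/network&quot;   # local path (no version argument allowed)
}

module &quot;git_example&quot; {
  # Git source: // selects a subdirectory, ?ref= pins a tag or branch
  source = &quot;git::https://example.com/infra.git//modules/network?ref=v1.2.0&quot;
}
&lt;/code&gt;&lt;/pre&gt;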
&lt;h1&gt;Advanced Usage&lt;/h1&gt;
&lt;p&gt;This example helped me understand what a module is, how variables pass between the root module and child modules, advanced usage of &lt;code&gt;count&lt;/code&gt;, how to work around the limitation that &lt;code&gt;source&lt;/code&gt; cannot use variables, and how &lt;code&gt;null_resource&lt;/code&gt; and &lt;code&gt;local-exec&lt;/code&gt; are used.&lt;/p&gt;
&lt;h2&gt;Directory Structure&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;❯ tree
.
├── main.tf
├── modules
│   ├── aws_module
│   │   └── main.tf
│   ├── azure_module
│   │   └── main.tf
│   └── wrapper
│       ├── main.tf
│       └── variables.tf
└── variables.tf
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Code Walkthrough&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;❯ cat main.tf
module &quot;cloud_infra&quot; {
  source        = &quot;./modules/wrapper&quot;
  provider_type = var.provider_type
}

❯ cat variables.tf
variable &quot;provider_type&quot; {
  description = &quot;Which cloud provider to use: &apos;aws&apos; or &apos;azure&apos;&quot;
  type        = string
  default     = &quot;aws&quot;
}
❯ cat modules/wrapper/main.tf
module &quot;aws&quot; {
  source = &quot;../aws_module&quot;
  count  = var.provider_type == &quot;aws&quot; ? 1 : 0
}

module &quot;azure&quot; {
  source = &quot;../azure_module&quot;
  count  = var.provider_type == &quot;azure&quot; ? 1 : 0
}
❯ cat modules/wrapper/variables.tf
variable &quot;provider_type&quot; {
  description = &quot;Cloud provider type: aws or azure&quot;
  type        = string
}
❯ cat modules/aws_module/main.tf
resource &quot;null_resource&quot; &quot;aws_example&quot; {
  provisioner &quot;local-exec&quot; {
    command = &quot;echo Deploying AWS Infrastructure&quot;
  }
}
❯ cat modules/azure_module/main.tf
resource &quot;null_resource&quot; &quot;azure_example&quot; {
  provisioner &quot;local-exec&quot; {
    command = &quot;echo Deploying Azure Infrastructure&quot;
  }
}
&lt;/code&gt;&lt;/pre&gt;
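&lt;p&gt;Because &lt;code&gt;provider_type&lt;/code&gt; defaults to &lt;code&gt;aws&lt;/code&gt;, switching clouds is just a variable override at plan/apply time:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;terraform plan -var=&quot;provider_type=azure&quot;
terraform apply -var=&quot;provider_type=azure&quot;
&lt;/code&gt;&lt;/pre&gt;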
&lt;p&gt;&lt;img src=&quot;./terraform-plan.jpg&quot; alt=&quot;output of terraform plan&quot; title=&quot;output of terraform plan&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./terraform-apply.jpg&quot; alt=&quot;output of terraform apply&quot; title=&quot;output of terraform apply&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;Takeaways&lt;/h2&gt;
&lt;p&gt;This is just an example to &lt;strong&gt;illustrate the usage of a wrapper module&lt;/strong&gt; in Terraform, but it’s also grounded in &lt;strong&gt;practical value&lt;/strong&gt; you’d encounter in real-world scenarios.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Modular Abstraction&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The root module only needs to know &lt;strong&gt;what&lt;/strong&gt; to deploy, not &lt;strong&gt;how&lt;/strong&gt; it&apos;s deployed for each provider.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;You use a variable (e.g. &lt;code&gt;provider_type&lt;/code&gt;) to switch between providers.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Wrapper Logic in Its Own Module&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The wrapper module decides &lt;strong&gt;which provider-specific module&lt;/strong&gt; to load based on the input value.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;This avoids conditional logic spread across the root module.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Directory Structure Reflects Cloud Providers&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Each cloud (AWS, Azure, etc.) has its own submodule with isolated logic.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;This keeps code clean and avoids mixing different cloud resources in the same files.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Dynamic Module Source Selection&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Since &lt;code&gt;source&lt;/code&gt; must be static, you instead drive a &lt;code&gt;count&lt;/code&gt; condition with a variable so that only the desired submodule (&lt;code&gt;aws&lt;/code&gt;, &lt;code&gt;azure&lt;/code&gt;, etc.) gets instantiated.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;This is static at plan/apply time, but flexible from a design perspective.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Encapsulation of Variables&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Variables like &lt;code&gt;provider_type&lt;/code&gt; are &lt;strong&gt;declared at every module level&lt;/strong&gt; that needs them.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;This ensures smooth flow of configuration down from root module → wrapper → cloud module.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Easy to Extend&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Adding support for a new cloud provider (e.g. GCP) is as simple as adding a new folder and extending the wrapper logic — no changes needed in the root module.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h1&gt;Summary&lt;/h1&gt;
&lt;p&gt;Terraform meta arguments, like &lt;code&gt;lifecycle&lt;/code&gt; and &lt;code&gt;provisioner&lt;/code&gt;, can seem straightforward until you hit real-world use cases. In this post, I broke down each of these Terraform meta arguments with clear explanations, practical examples, and some gotchas to watch out for. I also shared a working demo using a wrapper module pattern to dynamically deploy AWS or Azure modules based on input variables. Whether you&apos;re new to Terraform modules or just want to sharpen your understanding of Terraform meta argument behaviors, this guide aims to bring clarity to the chaos.&lt;/p&gt;
&lt;p&gt;:::info
I’ve pushed everything to GitHub for you! You can find all the Terraform scripts from this blog post right here: 👉 &lt;a href=&quot;https://github.com/geekcoding101/iac/tree/main/terraform/tutorial&quot;&gt;https://github.com/geekcoding101/iac/tree/main/terraform/tutorial&lt;/a&gt;  🚀
:::&lt;/p&gt;
&lt;p&gt;:::tip
Feel free to check out my other Terraform blog posts:&lt;a href=&quot;/posts/1-terraform-with-aws-iam-ec2&quot;&gt;Mastering Terraform with AWS Guide Part 1: Launch Real AWS Infrastructure with VPC, IAM and EC2&lt;/a&gt;
:::&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;:::info
You&apos;re on a roll! Don&apos;t stop now—check out the full series of Kubernetes and level up your Kubernetes skills. Each post builds on the last, so make sure you haven’t missed anything! 👇&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/kubernetes-tutorial-part1&quot;&gt;Part 1&lt;/a&gt;&lt;/strong&gt;, I laid out the &lt;strong&gt;networking plan&lt;/strong&gt;, my &lt;strong&gt;goals for setting up Kubernetes&lt;/strong&gt;, and how to &lt;strong&gt;prepare a base VM image&lt;/strong&gt; for the cluster.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/tutorial-part2-dns-server-ntp&quot;&gt;Part 2&lt;/a&gt;&lt;/strong&gt;, I walked through &lt;strong&gt;configuring a local DNS server and NTP server&lt;/strong&gt;, essential for stable name resolution and time synchronization across nodes. These foundational steps will make our Kubernetes setup smoother.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/part3-kubernetes-cluster-setup&quot;&gt;Part 3&lt;/a&gt;&lt;/strong&gt;, I finished the Kubernetes cluster setup with &lt;strong&gt;Flannel&lt;/strong&gt;, ending up with one Kubernetes master and 4 worker nodes ready for real workloads.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/part3-kubernetes-cluster-setup&quot;&gt;Part 4&lt;/a&gt;&lt;/strong&gt;, I explored &lt;strong&gt;NodePort&lt;/strong&gt; and &lt;strong&gt;ClusterIP&lt;/strong&gt;, and understood the key differences, use cases, and when to choose each for internal and external service access! 🔥&lt;/p&gt;
&lt;p&gt;🚀 In &lt;a href=&quot;/posts/externalname-loadbalancer-5&quot;&gt;&lt;strong&gt;Part 5&lt;/strong&gt;&lt;/a&gt;, I dived into &lt;strong&gt;&lt;code&gt;ExternalName&lt;/code&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;code&gt;LoadBalancer&lt;/code&gt;&lt;/strong&gt; services, uncovering how they handle external access, DNS resolution, and dynamic traffic distribution!
:::&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Terraform Associate Exam: A Powerful Guide about How to Prepare It</title><link>https://geekcoding101.com/posts/howto-prepare-terraform-associate</link><guid isPermaLink="true">https://geekcoding101.com/posts/howto-prepare-terraform-associate</guid><pubDate>Sun, 27 Apr 2025 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;./passed-result.jpg&quot; alt=&quot;passed Terraform associate exam screenshot&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Today, I officially passed the &lt;a href=&quot;https://developer.hashicorp.com/terraform/tutorials/certification-003/associate-review-003&quot;&gt;&lt;strong&gt;HashiCorp Certified: Terraform Associate (003)&lt;/strong&gt;&lt;/a&gt; exam! 🚀&lt;/p&gt;
&lt;p&gt;It wasn’t hard. It&apos;s a one-hour exam and I finished in about &lt;strong&gt;40 minutes&lt;/strong&gt;, reviewed a few flagged questions, and then confidently submitted.&lt;/p&gt;
&lt;p&gt;Now, while I&apos;m parking the more advanced &lt;a href=&quot;https://developer.hashicorp.com/terraform/tutorials/pro-cert/pro-study&quot;&gt;&lt;strong&gt;HashiCorp Certified: Terraform Authoring &amp;amp; Operations Professional with AWS (HCTOP-002-AWS)&lt;/strong&gt;&lt;/a&gt; for the moment, my next mission is to tackle &lt;a href=&quot;https://training.linuxfoundation.org/certification/certified-kubernetes-administrator-cka/&quot;&gt;&lt;strong&gt;Certified Kubernetes Administrator (CKA)&lt;/strong&gt;&lt;/a&gt;.&lt;br /&gt;
After that, we’ll see whether I circle back to the Terraform Professional Level exam.&lt;/p&gt;
&lt;h1&gt;🔁 A Quick Rewind: The Journey&lt;/h1&gt;
&lt;p&gt;If you’ve seen my last two blog posts about Terraform, you might have guessed it —&lt;br /&gt;
👉 I actually booked the Terraform associate exam upfront, way before I even started the real preparation.&lt;/p&gt;
&lt;p&gt;I booked it on purpose to push myself — to create that real, no-turning-back deadline pressure.&lt;br /&gt;
Since then, I&apos;ve been squeezing in study time almost every day, balancing learning from Udemy courses, doing hands-on practice, and posting my Terraform learning journey right here on this blog.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lesson?&lt;/strong&gt;&lt;br /&gt;
Setting a real goal date works. It forces you to move!&lt;/p&gt;
&lt;h2&gt;Terraform associate certification cost?&lt;/h2&gt;
&lt;p&gt;It&apos;s $70.50 as of today.&lt;/p&gt;
&lt;h2&gt;How long does terraform associate exam take?&lt;/h2&gt;
&lt;p&gt;One hour.&lt;/p&gt;
&lt;h2&gt;How many questions in terraform associate exam?&lt;/h2&gt;
&lt;p&gt;Total 57 questions.&lt;/p&gt;
&lt;h2&gt;What type of questions in Terraform associate exam?&lt;/h2&gt;
&lt;p&gt;Good question! The Terraform associate exam includes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Multiple Choice (Single Answer)&lt;/li&gt;
&lt;li&gt;Multiple Choice (Multiple Answer)&lt;/li&gt;
&lt;li&gt;True/False&lt;/li&gt;
&lt;li&gt;Fill in the blank&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;You can find the sample questions at &lt;a href=&quot;https://developer.hashicorp.com/terraform/tutorials/certification-003/associate-questions&quot;&gt;HashiCorp official site here&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Is the &lt;strong&gt;Terraform Associate Certification&lt;/strong&gt; worth it?&lt;/h2&gt;
&lt;p&gt;In short: &lt;strong&gt;absolutely, yes — if you work with cloud infrastructure&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Terraform has become the industry standard for Infrastructure as Code (IaC), and getting certified shows you understand not just the basic commands, but the &lt;em&gt;right&lt;/em&gt; way to manage infrastructure at scale — including modules, state management, workspaces, and advanced features like remote backends. The certification isn’t just a checkbox; it validates that you can design clean, reusable, and reliable Terraform configurations — a huge plus for DevOps, cloud engineers, and even platform architects.&lt;/p&gt;
&lt;p&gt;The Professional-level exams, by comparison, are entirely hands-on, take three hours to finish, and are quite challenging!&lt;/p&gt;
&lt;p&gt;If you&apos;re serious about leveling up in the cloud/DevOps world, I recommend taking it.&lt;/p&gt;
&lt;h1&gt;📚 Resources I Used to Prepare&lt;/h1&gt;
&lt;p&gt;Big shoutout to the two courses that shaped my Terraform associate exam preparation journey:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;🎯 &lt;strong&gt;The official exam review guide on the HashiCorp site, &lt;a href=&quot;https://developer.hashicorp.com/terraform/tutorials/certification-003/associate-review-003&quot;&gt;available here&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;🎯 &lt;strong&gt;All in One course for learning Terraform and gaining the official Terraform Associate Certification (003)&lt;/strong&gt; by &lt;em&gt;Zeal Vora -&lt;/em&gt; 👉 &lt;a href=&quot;https://cohesity.udemy.com/course/terraform-beginner-to-advanced&quot;&gt;Check it out here&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;🎯 &lt;strong&gt;The Original Terraform Associate 003 Prep: Pass your Terraform cert with 300+ Questions with Explanations and Resources&lt;/strong&gt; by &lt;em&gt;Bryan Krausen -&lt;/em&gt; 👉 &lt;a href=&quot;https://cohesity.udemy.com/course/terraform-associate-practice-exam/&quot;&gt;Check it out here&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I also created my own error notebook and &lt;strong&gt;learning notes&lt;/strong&gt; during my preparation, which you&apos;ll see in the next sections.&lt;/p&gt;
&lt;hr /&gt;
&lt;h1&gt;✍️ My Terraform Associate Notes and Takeaways&lt;/h1&gt;
&lt;p&gt;Here&apos;s a brain-dump of everything I noted down during my preparation — real, practical, exam-focused.&lt;/p&gt;
&lt;p&gt;Even if you feel the Terraform Associate is a beginner-level exam, I still highly recommend reading through every point noted here! You&apos;ll find that these come in handy well beyond the Terraform Associate exam!&lt;/p&gt;
&lt;h2&gt;Key Terraform Concepts and Nuggets&lt;/h2&gt;
&lt;h3&gt;Main Notes&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Terraform Community CLI &lt;strong&gt;does not natively support VCS&lt;/strong&gt; (Version Control System) connections. You have to manually pull/push. But Terraform Cloud / HCP Terraform support it.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;There is a special &lt;code&gt;moved&lt;/code&gt; block. It tells Terraform that a resource has changed address, without touching the actual infrastructure. Use it to safely rename or restructure resources and modules. Example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;moved {
  from = aws_instance.old_name
  to   = aws_instance.new_name
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It tells Terraform:&lt;/p&gt;
&lt;p&gt;:::info
This old resource is now considered to be at this new address — &lt;strong&gt;DON&apos;T destroy and recreate it&lt;/strong&gt;.
:::&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Since Terraform 1.5, there is an &lt;code&gt;import&lt;/code&gt; &lt;strong&gt;block&lt;/strong&gt;! &lt;strong&gt;Old way:&lt;/strong&gt; run &lt;code&gt;terraform import&lt;/code&gt; manually for each resource:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;terraform import &amp;lt;resource_type&amp;gt;.&amp;lt;resource_name&amp;gt; &amp;lt;real-world-ID&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;New way:&lt;/strong&gt; Declare imports alongside code:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import {
  id = &quot;i-1234567890abcdef0&quot;
  to = aws_instance.example
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It tells Terraform:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Terraform will import the AWS EC2 instance with ID &lt;code&gt;i-1234567890abcdef0&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;And map it to the resource &lt;code&gt;aws_instance.example&lt;/code&gt; declared in your &lt;code&gt;.tf&lt;/code&gt; file.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;terraform console&lt;/code&gt; also locks the state file!&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;True or false: In Terraform Community, workspaces generally use the same code repository while workspaces in Terraform Enterprise/Cloud are often mapped to different code repositories. &lt;strong&gt;Answer: True.&lt;/strong&gt; In Terraform Community, workspaces typically share the same code repository, allowing multiple environments or configurations to be managed within the same repository. On the other hand, in Terraform Enterprise/Cloud, workspaces are often mapped to different code repositories to provide better isolation and organization for different projects or teams.&lt;/p&gt;
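&lt;p&gt;For reference, the Community-edition workspace commands look like this (the workspace names are just examples):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;terraform workspace new dev      # create and switch to a new workspace
terraform workspace list         # list workspaces; * marks the current one
terraform workspace select prod  # switch to an existing workspace
terraform workspace show         # print the current workspace name
&lt;/code&gt;&lt;/pre&gt;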
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;An object type can specify a data type for &lt;strong&gt;each field&lt;/strong&gt;, but all values in a map type must be the &lt;strong&gt;same type&lt;/strong&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;variable &quot;example_map&quot; {
  type = map(string)
}
&lt;/code&gt;&lt;/pre&gt;
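&lt;p&gt;By contrast, an &lt;code&gt;object&lt;/code&gt; type can give each field its own type. A minimal sketch (the field names are illustrative):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;variable &quot;example_object&quot; {
  type = object({
    name   = string   # each field declares its own type
    cpus   = number
    public = bool
  })
}
&lt;/code&gt;&lt;/pre&gt;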
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;True or false:&lt;/strong&gt; Infrastructure as code (IaC) tools allow you to manage infrastructure with configuration files rather than through a graphical user interface. &lt;strong&gt;Answer: True&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;provider&lt;/code&gt; block is not a must! Terraform can automatically detect and use providers based on the resource configurations defined in the code.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;terraform plan&lt;/code&gt; is not a must before &lt;code&gt;terraform apply&lt;/code&gt;!&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Run &lt;code&gt;terraform init&lt;/code&gt; successfully, then directly run &lt;code&gt;terraform apply&lt;/code&gt;. What happens? It will scan the target infrastructure, create a new state file, and then deploy.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Run &lt;code&gt;terraform init&lt;/code&gt;, then remove the version line from the module block and run &lt;code&gt;terraform init -upgrade&lt;/code&gt;. What happens? Terraform &lt;strong&gt;WILL NOT&lt;/strong&gt; download the latest version of the module! &lt;strong&gt;Terraform keeps using the already-downloaded version because it is cached locally!&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Run &lt;code&gt;terraform init -upgrade&lt;/code&gt;. What happens? It checks for and downloads the latest versions of plugins/modules that comply with the configuration&apos;s version constraints.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If the backend hosting the state does not support state locking, two &lt;code&gt;terraform apply&lt;/code&gt; runs at the same time might corrupt the state file!&lt;/p&gt;
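&lt;p&gt;If a crashed run leaves a stale lock behind, it can be cleared manually. Use this with care, and only when you are sure no other run is active:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;terraform force-unlock &amp;lt;LOCK_ID&amp;gt;    # release a stale state lock
terraform apply -lock-timeout=60s    # or wait up to 60s for the lock instead
&lt;/code&gt;&lt;/pre&gt;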
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;One purpose of the state file is to improve performance!&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;What do Terraform agents do? They execute plans and apply changes in your infrastructure!&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;True or False:&lt;/strong&gt; Any sensitive values referenced in the Terraform code, even as variables, will end up in plain text in the state file. &lt;strong&gt;That’s TRUE!!!!&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Resources of the same type need a provider alias to&lt;/strong&gt; use different provider configurations! For example, one AWS resource can live in the east region while another, using an aliased provider, can live in the west region!&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The primary use of Infrastructure as Code (IaC)? &lt;strong&gt;The ability to programmatically deploy and configure resources&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;True or False:&lt;/strong&gt; In both Terraform Community and HCP Terraform, workspaces provide similar functionality of &lt;strong&gt;using a separate state file for each workspace&lt;/strong&gt;. &lt;strong&gt;Answer: True&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;For_each Value Referencing Table&lt;/h3&gt;
&lt;p&gt;Example code:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;variable &quot;env&quot; {
  type = map(any)
  default = {
    prod = {
      ip = &quot;10.0.150.0/24&quot;
      az = &quot;us-east-1a&quot;
    }
    dev = {
      ip = &quot;10.0.250.0/24&quot;
      az = &quot;us-east-1e&quot;
    }
  }
}
resource &quot;aws_subnet&quot; &quot;example&quot; {
  for_each          = var.env
  cidr_block        = each.value.ip
  availability_zone = each.value.az
  tags = {
    Name = &quot;subnet-${each.key}&quot;
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Keyword&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;each.key&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The map key (prod or dev)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;each.value&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The full object for that key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;each.value.ip&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The IP address for that environment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;each.value.az&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The availability zone for that environment&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3&gt;📚 &lt;strong&gt;Terraform Golden Rule Mismatch Table&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;Well, I gave the name &quot;Golden Rule&quot; ^^&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mismatch&lt;/th&gt;
&lt;th&gt;Terraform Reaction&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Missing in state but exists in config?&lt;/td&gt;
&lt;td&gt;Terraform plans to create it.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Exists in state but missing in config?&lt;/td&gt;
&lt;td&gt;Terraform plans to destroy it.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2&gt;More Terraform Pro Tips&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;The credentials for &lt;code&gt;aws&lt;/code&gt; are defined in the &lt;code&gt;provider&lt;/code&gt; block! (Hardcoding them like this works, but in practice prefer environment variables or shared credentials files.)&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;provider &quot;aws&quot; {
  region     = &quot;us-east-1&quot;
  access_key = &quot;YOUR_ACCESS_KEY&quot;
  secret_key = &quot;YOUR_SECRET_KEY&quot;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;There is no &lt;code&gt;name&lt;/code&gt; argument in a &lt;code&gt;provider&lt;/code&gt; block; it uses &lt;code&gt;alias&lt;/code&gt; instead. How do you use the alias in a resource?&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;provider &quot;aws&quot; {
  region = &quot;us-east-1&quot;
}
provider &quot;aws&quot; {
  alias  = &quot;west&quot;
  region = &quot;us-west-2&quot;
}
resource &quot;aws_instance&quot; &quot;example&quot; {
  provider      = aws.west  # use the aliased provider
  ami           = &quot;ami-0c55b159cbfafe1f0&quot;
  instance_type = &quot;t2.micro&quot;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Inside terraform block: &lt;code&gt;required_version&lt;/code&gt; - to constrain Terraform CLI version.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Backend&lt;/strong&gt; configuration must be defined &lt;strong&gt;inside the terraform block&lt;/strong&gt;. You cannot define a backend inside a provider block or outside the terraform block:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;terraform {
  backend &quot;remote&quot; {
    hostname = &quot;app.terraform.io&quot;
    organization = &quot;btk&quot;

    workspaces {
      name = &quot;bryan-prod&quot;
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Constrain one or multiple provider versions in the &lt;code&gt;terraform&lt;/code&gt; block. &lt;strong&gt;None&lt;/strong&gt; of this &lt;strong&gt;can go in a &lt;code&gt;provider&lt;/code&gt; block!&lt;/strong&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;terraform {
  required_providers {
    aws = {
      source  = &quot;hashicorp/aws&quot;
      version = &quot;~&amp;gt; 5.0&quot;
    }
    azurerm = {
      source = &quot;hashicorp/azurerm&quot;
      version = &quot;2.90.0&quot;
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The source path pattern when &lt;strong&gt;using a private registry&lt;/strong&gt; is &lt;code&gt;&amp;lt;hostname&amp;gt;/&amp;lt;namespace&amp;gt;/&amp;lt;name&amp;gt;/&amp;lt;provider&amp;gt;&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;module &quot;vpc&quot; {
  source  = &quot;registry.example.com/devops-team/vpc-module/aws&quot;
  version = &quot;1.0.3&quot;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The module below comes from the Terraform public registry!&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;module &quot;consul&quot; {
  source = &quot;hashicorp/consul/aws&quot;
  version = &quot;0.1.0&quot;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;References Value&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Reference Format&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Variable&lt;/td&gt;
&lt;td&gt;&lt;code&gt;var.&amp;lt;variable_name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;var.instance_type&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Local&lt;/td&gt;
&lt;td&gt;&lt;code&gt;local.&amp;lt;local_name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;local.default_tags&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data Source&lt;/td&gt;
&lt;td&gt;&lt;code&gt;data.&amp;lt;provider&amp;gt;_&amp;lt;data_type&amp;gt;.&amp;lt;name&amp;gt;.&amp;lt;attribute&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;data.aws_ami.ubuntu.id&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Module Output&lt;/td&gt;
&lt;td&gt;&lt;code&gt;module.&amp;lt;module_name&amp;gt;.&amp;lt;output_name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;module.network.vpc_id&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Resource Attribute&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;provider&amp;gt;_&amp;lt;resource_type&amp;gt;.&amp;lt;resource_name&amp;gt;.&amp;lt;attribute&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;aws_instance.web.public_ip&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Terraform Built-in Functions&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;function&amp;gt;(&amp;lt;arguments&amp;gt;)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cidrsubnet(var.vpc_cidr, 8, 1)&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Terraform Meta-Arguments (special cases)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;self.&amp;lt;attribute&amp;gt;&lt;/code&gt; (within resource)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;self.public_ip&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;:::info
Locals are defined in a &lt;code&gt;locals&lt;/code&gt; block (plural) but referenced as &lt;code&gt;local.&amp;lt;name&amp;gt;&lt;/code&gt; (singular)!
:::&lt;/p&gt;
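&lt;p&gt;A quick sketch of locals in use: defined in a &lt;code&gt;locals&lt;/code&gt; block (plural), referenced as &lt;code&gt;local.&amp;lt;name&amp;gt;&lt;/code&gt; (singular). The names below are illustrative:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;locals {
  default_tags = {
    Project = &quot;demo&quot;
    Owner   = &quot;geekcoding101&quot;
  }
}

resource &quot;aws_s3_bucket&quot; &quot;example&quot; {
  bucket = &quot;my-demo-bucket&quot;
  tags   = local.default_tags   # singular &quot;local.&quot; when referencing
}
&lt;/code&gt;&lt;/pre&gt;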
&lt;h1&gt;✨ Closing Thoughts&lt;/h1&gt;
&lt;p&gt;Passing the &lt;strong&gt;Terraform Associate (003)&lt;/strong&gt; cert wasn&apos;t brutal — it just needed &lt;strong&gt;focused practice&lt;/strong&gt;, &lt;strong&gt;real deadlines&lt;/strong&gt;, and &lt;strong&gt;hands-on experience&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Next stop: &lt;strong&gt;CKA (Certified Kubernetes Administrator)&lt;/strong&gt;!&lt;br /&gt;
Maybe afterward, I&apos;ll resume the Terraform Pro-level certs — but for now, it’s Kubernetes grind time. 🚀&lt;/p&gt;
&lt;p&gt;:::info
Feel free to check out my previous posts about Terraform:&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;/posts/1-terraform-with-aws-iam-ec2&quot;&gt;Part 1: Mastering Terraform with AWS Guide Part 1: Launch Real AWS Infrastructure with VPC, IAM and EC2&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;/posts/terraform-meta-arguments&quot;&gt;Part 2: Terraform Meta Arguments Unlocked: Practical Patterns for Clean Infrastructure Code&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;:::&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Golang Range Loop Reference - Why Your Loop Keeps Giving You the Same Pointer (and How to Fix It)</title><link>https://geekcoding101.com/posts/golang-range-loop-reference</link><guid isPermaLink="true">https://geekcoding101.com/posts/golang-range-loop-reference</guid><pubDate>Mon, 05 May 2025 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;When I first started learning Go, I thought I was doing everything right, until I ran into a weird bug with Go&apos;s range loop references. I was iterating over a list of &lt;code&gt;Book&lt;/code&gt; structs (of course, I can&apos;t share the real structs and code used here... everything below is for tutorial purposes), taking the pointer to each one, and storing the pointers in a slice. But at the end of the loop, all the pointers pointed to... the same book?! 🤯&lt;/p&gt;
&lt;p&gt;Let’s walk through this classic Go beginner mistake together — and fix it the right way.&lt;/p&gt;
&lt;h1&gt;📚 The Use Case: A Slice of Books in a Library&lt;/h1&gt;
&lt;p&gt;Suppose we have a list of books, and we want to collect pointers to each one so we can modify them later.&lt;/p&gt;
&lt;p&gt;Here’s the code I &lt;em&gt;thought&lt;/em&gt; would work:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;for _, book := range books {
    bookPointers = append(bookPointers, &amp;amp;book) // Oops...
}

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But when I printed out the pointers, they all pointed to the &lt;em&gt;last&lt;/em&gt; book in the list. This bug stumped me for a while until I understood one critical Go behavior.&lt;/p&gt;
&lt;h1&gt;The File Structure To Run The Code&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;learning-golang/
├── 01-loop-reference-pitfall/
│   ├── main.go
│   └── README.md
├── Makefile
├── bin/
└── go.mod

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is the complete buggy code:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;package main

import (
    &quot;fmt&quot;
)

type Book struct {
    Title  string
    Author string
}

func main() {
    originalBooks := []Book{
        {&quot;Go in Action&quot;, &quot;William Kennedy&quot;},
        {&quot;The Go Programming Language&quot;, &quot;Alan Donovan&quot;},
        {&quot;Introducing Go&quot;, &quot;Caleb Doxsey&quot;},
    }

    fmt.Println(&quot;❌ Buggy Version:&quot;)
    var buggyPointers []*Book
    for _, book := range originalBooks {
        buggyPointers = append(buggyPointers, &amp;amp;book)
    }
    for _, bp := range buggyPointers {
        fmt.Printf(&quot;Title: %-30s | Address: %p\n&quot;, bp.Title, bp)
    }
}

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The Makefile:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Usage:
#   make run DIR=01-loop-reference-pitfall
#   make build DIR=01-loop-reference-pitfall
#   make clean

GO=go

run:
    @echo &quot;👉 Running $(DIR)/main.go...&quot;
    cd $(DIR) &amp;amp;&amp;amp; $(GO) run main.go

build:
    @echo &quot;🔧 Building binary from $(DIR)/main.go...&quot;
    cd $(DIR) &amp;amp;&amp;amp; $(GO) build -o ../bin/$(notdir $(DIR))

clean:
    @echo &quot;🧹 Cleaning up built binaries...&quot;
    rm -rf bin/

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Build and Run The Code&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;❯ go mod init github.com/geekcoding101/learning-golang
❯ make run DIR=01-loop-reference-pitfall

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&quot;./first-time-build-and-run.jpg&quot; alt=&quot;Golang Range Loop Reference - First time build and run code&quot; title=&quot;Golang Range Loop Reference - First time build and run code&quot; /&gt;&lt;/p&gt;
&lt;p&gt;As you can see, the address didn&apos;t change at all!&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;🐛 The Problem: Golang Reuses the Range Loop Variable&lt;/h2&gt;
&lt;p&gt;In Go, when you do:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;for _, book := range books
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;book&lt;/code&gt; variable is reused in every iteration; it&apos;s not a new instance each time. So taking &lt;code&gt;&amp;amp;book&lt;/code&gt; gives you the &lt;em&gt;same memory address&lt;/em&gt; over and over.&lt;/p&gt;
&lt;p&gt;This means every pointer in the slice points to the same memory location, which at the end of the loop holds the value of the &lt;strong&gt;last book&lt;/strong&gt;. (Note: Go 1.22 changed this behavior so that each iteration gets a fresh loop variable; the pitfall applies to earlier Go versions, or to modules whose &lt;code&gt;go&lt;/code&gt; directive targets a version before 1.22.)&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;✅ The Fix: Indexing Directly&lt;/h2&gt;
&lt;p&gt;The correct way is to use an index:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;fmt.Println(&quot;\n✅ Fixed Version:&quot;)
var fixedPointers []*Book
for i := range originalBooks {
    fixedPointers = append(fixedPointers, &amp;amp;originalBooks[i])
}
for _, bp := range fixedPointers {
    fmt.Printf(&quot;Title: %-30s | Address: %p\n&quot;, bp.Title, bp)
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, each pointer actually refers to the corresponding element in the original slice. Problem solved!&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./second-time-with-fix-build-and-run.jpg&quot; alt=&quot;second time with fix build and run&quot; title=&quot;second time with fix build and run&quot; /&gt;&lt;/p&gt;
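&lt;p&gt;An alternative fix, common before Go 1.22, is to shadow the loop variable so each iteration works on its own copy (the slice name below is illustrative):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;var shadowPointers []*Book
for _, book := range originalBooks {
    book := book // shadow: a fresh copy per iteration
    shadowPointers = append(shadowPointers, &amp;amp;book)
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that these pointers refer to copies, not to the elements of &lt;code&gt;originalBooks&lt;/code&gt;, so prefer indexing when you need to modify the originals.&lt;/p&gt;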
&lt;hr /&gt;
&lt;h1&gt;💻 Real Code Example on GitHub&lt;/h1&gt;
&lt;p&gt;I&apos;ve documented this bug and fix in my GitHub repository:&lt;/p&gt;
&lt;p&gt;👉 &lt;a href=&quot;https://github.com/geekcoding101/learning-golang&quot;&gt;github.com/geekcoding101/learning-golang&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Here&apos;s what you’ll find:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The buggy version (with all pointers pointing to the same book)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The fixed version (each pointer is correct)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A &lt;code&gt;Makefile&lt;/code&gt; to help you run and build each learning topic&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;hr /&gt;
&lt;h1&gt;✍️ Follow My Golang Tutorials&lt;/h1&gt;
&lt;p&gt;I’ll continue sharing these hands-on lessons as I deepen my understanding of Go.&lt;/p&gt;
&lt;p&gt;Check out my blog 👉 &lt;a href=&quot;https://www.geekcoding101.com&quot;&gt;www.geekcoding101.com&lt;/a&gt; — where I share practical posts, breakdowns, and real-world insights from my coding journey.&lt;/p&gt;
&lt;hr /&gt;
&lt;h1&gt;📌 Quick Hashtag&lt;/h1&gt;
&lt;p&gt;#Golang Range Loop Reference, #for loop golang range, #for loop range golang, #golang for range loop, #for loop in golang with range, #go slice reference, #go for loop pointer trap, #why does my go loop store the same pointer, #golang how to correctly get pointer from loop, #go for loop pointer always same&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>A 12 Factor Crash Course in Python: Build Clean, Scalable FastAPI Apps the Right Way</title><link>https://geekcoding101.com/posts/12-factor-crash-course</link><guid isPermaLink="true">https://geekcoding101.com/posts/12-factor-crash-course</guid><pubDate>Mon, 12 May 2025 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;&lt;img src=&quot;./feature-image-2.jpg&quot; alt=&quot;a 12-factor app crash course&quot; title=&quot;a 12-factor app crash course&quot; /&gt;&lt;/h1&gt;
&lt;h1&gt;Intro: Building Apps That Don’t Suck in Production&lt;/h1&gt;
&lt;p&gt;Let’s be honest—plenty of apps “work on my machine” but self-destruct the moment they meet the real world. Configs hardcoded, logs missing, environments confused, and deployments that feel like an escape room puzzle.&lt;/p&gt;
&lt;p&gt;If you want your service to thrive in production (and not become an ops horror story), you need a design philosophy that enforces &lt;strong&gt;clean separation, modularity, and resilience&lt;/strong&gt;. That&apos;s where the &lt;strong&gt;12 Factor App&lt;/strong&gt; methodology comes in.&lt;/p&gt;
&lt;p&gt;In this post, we’re going to break down &lt;strong&gt;each of the 12 factors&lt;/strong&gt; using a Python/FastAPI-based stack and walk through how to get them right.&lt;/p&gt;
&lt;h1&gt;🧱 The Twelve Factors — Python Style&lt;/h1&gt;
&lt;p&gt;Let’s take each principle, one by one. Think of it as a devops dojo, with Python as your katana.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;&lt;strong&gt;Codebase: One codebase tracked in revision control, many deploys&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;12 Factor App: Single source of truth, version-controlled, no Franken-repos.&lt;/p&gt;
&lt;p&gt;📌 &lt;strong&gt;In Python:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;One Git repo per service.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Don&apos;t share code across projects via copy-paste. Use internal packages or shared libraries (published to private PyPI or via Git submodules).&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;✅ &lt;strong&gt;Best Practice:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;/fastapi-12factor-app
├── app/
│   ├── api/
│   ├── core/
│   ├── models/
│   └── main.py
├── tests/
├── Dockerfile
├── pyproject.toml
├── README.md
└── .env

&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;&lt;strong&gt;Dependencies: Explicitly declare and isolate dependencies&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;12 Factor App: No implicit magic. Use virtualenvs and lock your deps.&lt;/p&gt;
&lt;p&gt;📌 &lt;strong&gt;In Python:&lt;/strong&gt; Use &lt;a href=&quot;https://peps.python.org/pep-0621/&quot;&gt;&lt;code&gt;pyproject.toml&lt;/code&gt;&lt;/a&gt; and a tool like &lt;strong&gt;Poetry&lt;/strong&gt; or &lt;strong&gt;pip-tools&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;✅ &lt;strong&gt;Example &lt;code&gt;pyproject.toml&lt;/code&gt;:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[tool.poetry.dependencies]
python = &quot;^3.12&quot;
fastapi = &quot;^0.110.0&quot;
uvicorn = &quot;^0.29.0&quot;
sqlalchemy = &quot;^2.0&quot;
pydantic = &quot;^2.6&quot;
python-dotenv = &quot;^1.0&quot;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;🔒 Lock it down:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;poetry lock

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And run your app in a containerized environment, so your coworker’s Python 3.6 setup doesn’t eat your soul.&lt;/p&gt;
&lt;h2&gt;&lt;strong&gt;Config: Store config in the environment&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;Configs aren’t code. Environment variables FTW.&lt;/p&gt;
&lt;p&gt;📌 &lt;strong&gt;In Python with Pydantic v2:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # Pydantic v2 style: configure via model_config instead of a nested Config class
    model_config = SettingsConfigDict(env_file=&quot;.env&quot;)

    database_url: str
    debug: bool = False

settings = Settings()

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;✅ &lt;code&gt;.env&lt;/code&gt; for local:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;DATABASE_URL=postgresql+asyncpg://user:pass@db:5432/app
DEBUG=true

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;🚀 Let Kubernetes inject real env vars in prod. No secrets in code, please.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;&lt;strong&gt;Backing Services: Treat backing services as attached resources&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;12 Factor App: Databases, queues, and blobs should be replaceable.&lt;/p&gt;
&lt;p&gt;📌 &lt;strong&gt;In FastAPI:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Define your database URL in &lt;code&gt;settings.database_url&lt;/code&gt;, not hardcoded. SQLAlchemy supports this beautifully.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from sqlalchemy.ext.asyncio import create_async_engine

engine = create_async_engine(settings.database_url, echo=settings.debug)

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;🧪 In test, you can override &lt;code&gt;DATABASE_URL&lt;/code&gt; with a SQLite memory DB. That’s the power of this separation.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;&lt;strong&gt;Build, Release, Run: Strictly separate build and run stages&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;12 Factor App: Immutable images. Don’t change code/configs post-build.&lt;/p&gt;
&lt;p&gt;📦 &lt;strong&gt;Dockerfile&lt;/strong&gt; example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;FROM python:3.12-slim

WORKDIR /app
COPY pyproject.toml .
RUN pip install poetry &amp;amp;&amp;amp; poetry install --without dev

COPY . .

CMD [&quot;uvicorn&quot;, &quot;app.main:app&quot;, &quot;--host&quot;, &quot;0.0.0.0&quot;, &quot;--port&quot;, &quot;8000&quot;]

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;👊 Don’t inject secrets during build—use &lt;code&gt;env&lt;/code&gt; at runtime.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;&lt;strong&gt;Processes: Execute the app as one or more stateless processes&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;12 Factor App: Stateless, share-nothing services.&lt;/p&gt;
&lt;p&gt;📌 &lt;strong&gt;In FastAPI:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Keep state (like DB sessions) outside the app object.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Use dependency injection for scoped connections.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker

async_session = async_sessionmaker(engine, expire_on_commit=False)

async def get_session() -&amp;gt; AsyncSession:
    async with async_session() as session:
        yield session

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This plays nice with Kubernetes autoscaling and kills zombie state.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;&lt;strong&gt;Port Binding: Export services via port binding&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;12 Factor App: Your app should be self-contained and listen on a port.&lt;/p&gt;
&lt;p&gt;✅ FastAPI does this naturally:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uvicorn app.main:app --host 0.0.0.0 --port 8000

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;K8s service can bind this to external ports as needed. No Apache/Nginx glue required.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;&lt;strong&gt;Concurrency: Scale out via the process model&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;12 Factor App: Scale horizontally, not by making megathreads.&lt;/p&gt;
&lt;p&gt;📌 Use &lt;strong&gt;Uvicorn workers&lt;/strong&gt; via &lt;code&gt;gunicorn&lt;/code&gt; if needed, or just scale pods in K8s:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;gunicorn -k uvicorn.workers.UvicornWorker app.main:app -w 4
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Or define a HorizontalPodAutoscaler in K8s—clean separation.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;&lt;strong&gt;Disposability: Fast startup and graceful shutdown&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;12 Factor App: Apps should start/stop fast and cleanly.&lt;/p&gt;
&lt;p&gt;✅ In FastAPI, use startup/shutdown events:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from fastapi import FastAPI

app = FastAPI()

@app.on_event(&quot;startup&quot;)
async def on_startup():
    print(&quot;Ready to go!&quot;)

@app.on_event(&quot;shutdown&quot;)
async def on_shutdown():
    print(&quot;Shutting down gracefully...&quot;)

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Kubernetes will send SIGTERM—be ready for it. (In current FastAPI versions, the &lt;code&gt;lifespan&lt;/code&gt; context manager is the recommended replacement for the deprecated &lt;code&gt;on_event&lt;/code&gt; hooks.)&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;&lt;strong&gt;Dev/Prod Parity: Keep development, staging, and production as similar as possible&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;📌 Use &lt;code&gt;.env&lt;/code&gt; for local, ConfigMaps/Secrets for prod, but same app code.&lt;/p&gt;
&lt;p&gt;Also—use &lt;strong&gt;Docker&lt;/strong&gt; for dev, same as prod. Don’t “just run it on the host.”&lt;/p&gt;
&lt;p&gt;✅ Use &lt;code&gt;docker-compose&lt;/code&gt; in dev (or Tilt/Skaffold) to mirror the prod infra.&lt;/p&gt;
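&lt;p&gt;A minimal &lt;code&gt;docker-compose.yml&lt;/code&gt; sketch for dev parity (the image tag and service names are illustrative):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;services:
  app:
    build: .                  # same Dockerfile as prod
    env_file: .env            # local config via environment variables
    ports:
      - &quot;8000:8000&quot;
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: app
&lt;/code&gt;&lt;/pre&gt;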
&lt;hr /&gt;
&lt;h2&gt;&lt;strong&gt;Logs: Treat logs as event streams&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;12 Factor App: Don’t write to files. Stream to stdout/stderr.&lt;/p&gt;
&lt;p&gt;✅ FastAPI + &lt;code&gt;logging&lt;/code&gt; setup:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@app.get(&quot;/health&quot;)
async def health():
    logger.info(&quot;Health check called&quot;)
    return {&quot;status&quot;: &quot;ok&quot;}

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;🎯 Let Kubernetes + Fluentd/ELK/Grafana Loki deal with aggregation.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;&lt;strong&gt;Admin Processes: Run admin/one-off tasks as one-off processes&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;12 Factor App: Run admin and one-off tasks as separate one-off processes. ✅ Create a separate &lt;code&gt;scripts/&lt;/code&gt; dir with admin tasks (DB migrations, data cleaning, etc.)&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;/scripts/
  └── migrate.py

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Run it as:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;python scripts/migrate.py

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Or use K8s Jobs for one-offs in production.&lt;/p&gt;
&lt;h1&gt;Cheatsheet&lt;/h1&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Factor&lt;/th&gt;
&lt;th&gt;Applies To&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Codebase&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;All apps&lt;/td&gt;
&lt;td&gt;One codebase per app, tracked in version control, with many deploys.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Dependencies&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Language/runtime&lt;/td&gt;
&lt;td&gt;Explicitly declare and isolate dependencies via a manifest (e.g., &lt;code&gt;pyproject.toml&lt;/code&gt;).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Config&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Environment management&lt;/td&gt;
&lt;td&gt;Store config in environment variables; never in code.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Backing Services&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Databases, queues, caches&lt;/td&gt;
&lt;td&gt;Treat services like resources; attach/detach them via config, not code changes.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Build, Release, Run&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;CI/CD pipelines&lt;/td&gt;
&lt;td&gt;Separate build, release, and run stages. Never change code/config after release.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Processes&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Application execution&lt;/td&gt;
&lt;td&gt;Execute apps as stateless processes; share nothing, scale horizontally.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Port Binding&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Web services&lt;/td&gt;
&lt;td&gt;Export services via port binding; don’t depend on external web servers.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Concurrency&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Scalability&lt;/td&gt;
&lt;td&gt;Scale out via process model; use multiple instances or pods, not threads.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Disposability&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Lifecycle management&lt;/td&gt;
&lt;td&gt;Fast startup and graceful shutdown improve robustness and scalability.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Dev/Prod Parity&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Dev environments&lt;/td&gt;
&lt;td&gt;Keep development, staging, and production as similar as possible.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Logs&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Observability&lt;/td&gt;
&lt;td&gt;Treat logs as event streams; write to stdout/stderr and let the platform handle aggregation.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Admin Processes&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;One-off tasks&lt;/td&gt;
&lt;td&gt;Run one-off admin tasks (e.g., migrations) as isolated processes, not part of the main app.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h1&gt;🔚 Wrapping It All Up&lt;/h1&gt;
&lt;p&gt;The 12 Factor App methodology isn’t just a checklist—it’s a &lt;strong&gt;survivability manual&lt;/strong&gt; for cloud-native apps. And FastAPI, paired with Pydantic v2 and SQLAlchemy, makes following these principles refreshingly clean.&lt;/p&gt;
&lt;p&gt;A few takeaways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Treat config like royalty—never hardcode it.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Keep your app stateless and dumb—let Kubernetes do the smart scaling.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Stream your logs, don&apos;t hoard them.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Build once, deploy often, break never (hopefully).&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;:::info
If you want to read more engineering notes and system design posts, feel free to browse the tags page &lt;a href=&quot;/tags&quot;&gt;here&lt;/a&gt;.
:::&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Kubernetes Control Plane Components Explained</title><link>https://geekcoding101.com/posts/kubernetes-control-plane-components</link><guid isPermaLink="true">https://geekcoding101.com/posts/kubernetes-control-plane-components</guid><pubDate>Sun, 18 May 2025 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Intro: So… What Powers the Kubernetes Brain?&lt;/h1&gt;
&lt;p&gt;Everyone loves to show off their YAML-fu and talk about Pods and Deployments, but what’s actually &lt;em&gt;running&lt;/em&gt; behind the scenes? What&apos;s keeping track of all your services, secrets, and scheduled chaos? Today I&apos;d like to give you a quick introduction to the &lt;strong&gt;Kubernetes control plane components&lt;/strong&gt;—the brains of the operation. It’s made up of several server-side components that work together like an orchestra of background daemons with trust issues and strict roles.&lt;/p&gt;
&lt;p&gt;In this post, we’ll demystify &lt;strong&gt;each core server&lt;/strong&gt; running in a Kubernetes control plane:&lt;br /&gt;
✅ &lt;code&gt;etcd&lt;/code&gt;&lt;br /&gt;
✅ &lt;code&gt;kube-apiserver&lt;/code&gt;&lt;br /&gt;
✅ &lt;code&gt;kube-scheduler&lt;/code&gt;&lt;br /&gt;
✅ &lt;code&gt;kube-controller-manager&lt;/code&gt;&lt;br /&gt;
✅ &lt;code&gt;cloud-controller-manager&lt;/code&gt;&lt;br /&gt;
✅ &lt;code&gt;kubelet&lt;/code&gt;&lt;br /&gt;
✅ &lt;code&gt;kube-proxy&lt;/code&gt;&lt;br /&gt;
✅ &lt;code&gt;coredns&lt;/code&gt;&lt;br /&gt;
✅ Optional players (metrics-server, CRI, CSI, etc.)&lt;/p&gt;
&lt;p&gt;Let’s start with the foundation: how these pieces talk.&lt;/p&gt;
&lt;hr /&gt;
&lt;h1&gt;🧩 Kubernetes Architecture in a Nutshell&lt;/h1&gt;
&lt;p&gt;Kubernetes has two main types of nodes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Control Plane Nodes&lt;/strong&gt; (aka “masters”): run the brain (scheduler, API, etc.)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Worker Nodes&lt;/strong&gt;: run your actual workloads (pods)&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;Think of the control plane as &lt;strong&gt;mission control&lt;/strong&gt; and the worker nodes as &lt;strong&gt;spacecraft&lt;/strong&gt;. One issues orders, the other executes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They talk over HTTP/gRPC and communicate securely via TLS.&lt;/p&gt;
&lt;p&gt;Now let’s break down the core components—what they are, what they do, and how they fit together.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./ascii-art-diagram.jpg&quot; alt=&quot;Kubernetes Control Plane Components ascii art diagram&quot; title=&quot;Kubernetes Control Plane Components ascii art diagram&quot; /&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;🔐 1. Kubernetes Control Plane Components - etcd&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt;&lt;br /&gt;
A distributed key-value store used to store &lt;em&gt;all&lt;/em&gt; cluster data: objects, state, configs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Think of it as:&lt;/strong&gt;&lt;br /&gt;
The &lt;strong&gt;brain’s hard disk&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What lives inside etcd?&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Pod definitions&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;ConfigMaps&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Secrets&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Cluster state&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Node info&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;RoleBindings, CRDs, everything&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Backed by:&lt;/strong&gt;&lt;br /&gt;
&lt;a href=&quot;https://etcd.io/&quot;&gt;etcd&lt;/a&gt; (from CoreOS), written in Go, uses the Raft consensus algorithm for HA and consistency.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Why it matters:&lt;/strong&gt;&lt;br /&gt;
You &lt;em&gt;never&lt;/em&gt; talk to etcd directly. Only the &lt;strong&gt;API server&lt;/strong&gt; does.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;🧭 2. Kubernetes Control Plane Components - kube-apiserver&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt;&lt;br /&gt;
The main RESTful API that all clients (kubectl, controllers, kubelet) talk to.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Think of it as:&lt;/strong&gt;&lt;br /&gt;
The &lt;strong&gt;gatekeeper&lt;/strong&gt; and &lt;strong&gt;translator&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Responsibilities:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Validates incoming requests&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Authenticates + authorizes them&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Talks to etcd to read/write state&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Notifies other components via the Watch API&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;It’s stateless&lt;/strong&gt;, and often run behind a load balancer for HA.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;⏱ 3. Kubernetes Control Plane Components - kube-scheduler&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt;&lt;br /&gt;
Assigns unscheduled pods to nodes, based on constraints.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Think of it as:&lt;/strong&gt;&lt;br /&gt;
The &lt;strong&gt;Tinder for workloads&lt;/strong&gt;. It matches pods to nodes based on:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;CPU/memory availability&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Node taints/tolerations&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Affinity rules&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Pod priority&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Custom scoring plugins&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;The flow:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;A pod is created with no &lt;code&gt;nodeName&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;API server stores it in etcd.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Scheduler sees it, scores possible nodes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;It picks a winner and updates the pod with &lt;code&gt;nodeName&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
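&lt;p&gt;The filter-then-score loop above can be sketched in a few lines of Python (a toy illustration of the shape of the decision, nothing like the real scheduler&apos;s plugin framework):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def pick_node(pod, nodes):
    # Filter: drop nodes that cannot fit the pod&apos;s CPU request.
    feasible = [n for n in nodes if n[&quot;free_cpu&quot;] &amp;gt;= pod[&quot;cpu&quot;]]
    if not feasible:
        return None  # pod stays Pending
    # Score: prefer the node with the most free CPU (one toy heuristic).
    return max(feasible, key=lambda n: n[&quot;free_cpu&quot;])[&quot;name&quot;]

nodes = [
    {&quot;name&quot;: &quot;node-a&quot;, &quot;free_cpu&quot;: 2},
    {&quot;name&quot;: &quot;node-b&quot;, &quot;free_cpu&quot;: 4},
]
print(pick_node({&quot;cpu&quot;: 3}, nodes))  # node-b
&lt;/code&gt;&lt;/pre&gt;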
&lt;hr /&gt;
&lt;h2&gt;🧙 4. Kubernetes Control Plane Components - kube-controller-manager&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt;&lt;br /&gt;
A daemon that runs &lt;strong&gt;many controllers&lt;/strong&gt; in one binary.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Think of it as:&lt;/strong&gt;&lt;br /&gt;
The &lt;strong&gt;cluster babysitter&lt;/strong&gt;—constantly checking for drift and correcting it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Runs controllers for:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Deployments&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Replicasets&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Nodes (watching for crashes)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Endpoints&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Persistent Volumes&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Certificates&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;How it works:&lt;/strong&gt;&lt;br /&gt;
Each controller watches a specific object type via the API server and ensures the desired state matches reality. If not, it acts.&lt;/p&gt;
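&lt;p&gt;That watch-and-reconcile pattern boils down to something like this (a deliberately simplified sketch; real controllers use informers, work queues, and the Watch API):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def reconcile(desired_replicas, actual_pods):
    # Compare desired state with reality, return the corrective action.
    diff = desired_replicas - len(actual_pods)
    if diff &amp;gt; 0:
        return (&quot;create&quot;, diff)
    if diff &amp;lt; 0:
        return (&quot;delete&quot;, -diff)
    return (&quot;noop&quot;, 0)

print(reconcile(3, [&quot;pod-1&quot;]))  # (&apos;create&apos;, 2)
&lt;/code&gt;&lt;/pre&gt;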
&lt;hr /&gt;
&lt;h2&gt;☁️ 5. Kubernetes Control Plane Components - cloud-controller-manager&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt;&lt;br /&gt;
An optional component for clusters running on public clouds (AWS, GCP, Azure).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Think of it as:&lt;/strong&gt;&lt;br /&gt;
The &lt;strong&gt;bridge between Kubernetes and your cloud infrastructure&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Responsibilities:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Creating LoadBalancers&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Managing cloud-based node info&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Attaching volumes (via CSI)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Updating routes&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You won’t see this in bare-metal clusters unless you set up external integrations manually.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;🧍 6. Kubernetes Control Plane Components - kubelet&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt;&lt;br /&gt;
A daemon that runs on &lt;em&gt;every&lt;/em&gt; worker node.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Think of it as:&lt;/strong&gt;&lt;br /&gt;
The &lt;strong&gt;node’s supervisor&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What it does:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Watches for pods assigned to its node&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Pulls container images via the CRI&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Starts/stops containers using container runtimes (containerd, CRI-O)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Sends status updates back to API server&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt;&lt;br /&gt;
kubelet does &lt;em&gt;not&lt;/em&gt; manage containers you started outside of Kubernetes.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;🌐 7. Kubernetes Control Plane Components - kube-proxy&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt;&lt;br /&gt;
Runs on each node to implement &lt;strong&gt;service networking&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Think of it as:&lt;/strong&gt;&lt;br /&gt;
A &lt;strong&gt;port-forwarding ninja&lt;/strong&gt; that handles cluster IP routing rules.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Modes:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;iptables (legacy, still common)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;ipvs (newer, faster, uses Linux’s IPVS)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;eBPF (if you’re fancy and using Cilium)&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;What it does:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Maintains network rules to forward traffic to the correct pod behind a Service&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Enables internal DNS (via CoreDNS)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Helps implement ClusterIP, NodePort, LoadBalancer behavior&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
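&lt;p&gt;Conceptually, the Service-to-pod forwarding it maintains is a rotating lookup table (a toy Python sketch with made-up IPs; the real rules live in iptables/IPVS/eBPF, not in user space):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import itertools

# A Service&apos;s ClusterIP fronts a rotating set of pod endpoints.
endpoints = {&quot;10.96.0.10&quot;: [&quot;10.244.1.5&quot;, &quot;10.244.2.7&quot;]}
cursors = {ip: itertools.cycle(pods) for ip, pods in endpoints.items()}

def forward(cluster_ip):
    # Pick the next backend pod for this Service (round-robin).
    return next(cursors[cluster_ip])
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Calling &lt;code&gt;forward(&quot;10.96.0.10&quot;)&lt;/code&gt; repeatedly alternates between the two pod IPs.&lt;/p&gt;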
&lt;hr /&gt;
&lt;h2&gt;🧠 8. Kubernetes Control Plane Components - CoreDNS&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt;&lt;br /&gt;
Default DNS service in Kubernetes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Think of it as:&lt;/strong&gt;&lt;br /&gt;
Your cluster’s internal &lt;strong&gt;phone book&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What it does:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Resolves pod/service names to cluster IPs&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Works with kube-dns-compatible tools&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Responds to all &lt;code&gt;*.svc.cluster.local&lt;/code&gt; domain lookups&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Runs as a Deployment + Service&lt;/strong&gt;, just like any other app—because Kubernetes eats its own dog food.&lt;/p&gt;
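&lt;p&gt;The naming scheme is mechanical enough to sketch: a Service&apos;s in-cluster FQDN is just the service name, its namespace, and the cluster suffix glued together.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def service_fqdn(service, namespace, cluster_domain=&quot;cluster.local&quot;):
    # How a Service is addressed inside the cluster.
    return f&quot;{service}.{namespace}.svc.{cluster_domain}&quot;

print(service_fqdn(&quot;my-api&quot;, &quot;default&quot;))
# my-api.default.svc.cluster.local
&lt;/code&gt;&lt;/pre&gt;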
&lt;hr /&gt;
&lt;h2&gt;🛠 Supporting Components&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;metrics-server&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Collects CPU/mem stats for autoscaling (HPA)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CRI runtimes&lt;/strong&gt; (containerd, CRI-O)&lt;/td&gt;
&lt;td&gt;Interface between kubelet and container engines&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CSI drivers&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Handle volume mounting for storage backends&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CNI plugins&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Provide networking (Calico, Flannel, Cilium, etc.)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Admission controllers&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;API gatekeepers that enforce rules (e.g., resource limits, policies)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;API Aggregation layer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Supports extending the Kubernetes API (like metrics.k8s.io)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr /&gt;
&lt;h1&gt;📦 Kubernetes Control Plane Components Diagram&lt;/h1&gt;
&lt;pre&gt;&lt;code&gt;+--------------------------------------------------+
|                  Control Plane                   |
| +-----------------+      +--------------------+  |
| | kube-apiserver  | &amp;lt;--&amp;gt; |  etcd              |  |
| +-----------------+      +--------------------+  |
|        ^                                         |
|        |                                         |
| +----------------------------+                   |
| |  kube-controller-manager   |                   |
| +----------------------------+                   |
| +----------------------------+                   |
| |       kube-scheduler       |                   |
| +----------------------------+                   |
| +----------------------------+                   |
| | cloud-controller-manager   |                   |
| |       (optional)           |                   |
| +----------------------------+                   |
+--------------------------------------------------+

+-------------------------------------------------+
|                Worker Node                      |
| +---------+  +-----------+  +-----------------+ |
| | kubelet |  | kube-proxy|  | containerd/CRI  | |
| +---------+  +-----------+  +-----------------+ |
| +---------------------------------------------+ |
| |                Pods (your app)              | |
| +---------------------------------------------+ |
+-------------------------------------------------+

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I really like this simplified diagram!&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;🧠 A Quick Reference&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;etcd&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Key-value store (source of truth)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;kube-apiserver&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cluster API gateway&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;kube-scheduler&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Assigns pods to nodes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;kube-controller-manager&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Maintains cluster state&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cloud-controller-manager&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Connects to cloud provider&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;kubelet&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Manages node and runs pods&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;kube-proxy&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Handles networking and routing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CoreDNS&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Resolves internal service names&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;:::info
You&apos;re on a roll! Don&apos;t stop now—check out the full series and level up your Kubernetes skills. Each post builds on the last, so make sure you haven’t missed anything! 👇&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/kubernetes-tutorial-part1&quot;&gt;Part 1&lt;/a&gt;&lt;/strong&gt;, I laid out the &lt;strong&gt;networking plan&lt;/strong&gt;, my &lt;strong&gt;goals for setting up Kubernetes&lt;/strong&gt;, and how to &lt;strong&gt;prepare a base VM image&lt;/strong&gt; for the cluster.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/tutorial-part2-dns-server-ntp&quot;&gt;Part 2&lt;/a&gt;&lt;/strong&gt;, I walked through &lt;strong&gt;configuring a local DNS server and NTP server&lt;/strong&gt;, essential for stable name resolution and time synchronization across nodes locally. These foundational steps will make our Kubernetes setup smoother.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/part3-kubernetes-cluster-setup&quot;&gt;Part 3&lt;/a&gt;&lt;/strong&gt;, I finished the Kubernetes cluster setup with Flannel, ending up with one Kubernetes master and 4 worker nodes ready for real workloads.&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;&lt;a href=&quot;/posts/part3-kubernetes-cluster-setup&quot;&gt;Part 4&lt;/a&gt;&lt;/strong&gt;, I explored NodePort and ClusterIP, covering the key differences, use cases, and when to choose each for internal and external service access! 🔥&lt;/p&gt;
&lt;p&gt;🚀 In &lt;strong&gt;Part 5&lt;/strong&gt;, the current one, I dive into &lt;code&gt;ExternalName&lt;/code&gt; and &lt;code&gt;LoadBalancer&lt;/code&gt; services, uncovering how they handle external access, DNS resolution, and dynamic traffic distribution!
:::&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Secure by Design Part 1: STRIDE Threat Modeling Explained</title><link>https://geekcoding101.com/posts/stride-threat-modeling-explained</link><guid isPermaLink="true">https://geekcoding101.com/posts/stride-threat-modeling-explained</guid><pubDate>Mon, 02 Jun 2025 00:00:00 GMT</pubDate><content:encoded>&lt;h2&gt;Intro: Why Every App Needs Threat Modeling And Why STRIDE&lt;/h2&gt;
&lt;p&gt;I’ve been meaning to write this post for a long time. Not because &lt;strong&gt;STRIDE Threat Modeling&lt;/strong&gt; is the hottest buzzword in cybersecurity—it isn’t. And not because threat modeling is some shiny new technique—it’s not. But because &lt;strong&gt;if you’re building or defending any system—especially something as deceptively simple as a chat app—threat modeling is non-negotiable&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;:::info
Check out this cool one-page &lt;a href=&quot;https://www.threatmodelingmanifesto.org/&quot;&gt;Threat Modeling Manifesto&lt;/a&gt;.
:::&lt;/p&gt;
&lt;p&gt;Whether you&apos;re knee-deep in &lt;strong&gt;SecOps&lt;/strong&gt;, defining &lt;strong&gt;IAM policies&lt;/strong&gt;, tuning your &lt;strong&gt;SIEM&lt;/strong&gt;, or crafting &lt;strong&gt;detection logic&lt;/strong&gt;, you’ve got one mission: protect the stuff that matters. That means user data, privacy, service uptime, reputation, and more. And if we don&apos;t design with threats in mind, we&apos;re just building breach bait with good intentions.&lt;/p&gt;
&lt;p&gt;So why STRIDE?&lt;/p&gt;
&lt;p&gt;Because &lt;strong&gt;STRIDE gives us a practical lens to view risk before the attacker does&lt;/strong&gt;. Instead of reacting to CVEs or chasing zero-days, STRIDE helps you think like a malicious actor while you’re still sketching your architecture in a whiteboard session or writing that controller code.&lt;/p&gt;
&lt;p&gt;In this post, I am going to use STRIDE threat modeling to walk through a seemingly simple application—a &lt;strong&gt;chat app&lt;/strong&gt;—and uncover the kinds of security holes that quietly turn into breach reports. You’ll see just how quickly things go sideways when we forget to ask, &lt;em&gt;“What could go wrong here?”&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;But first, let&apos;s talk about the app we&apos;re modeling.&lt;/p&gt;
&lt;h2&gt;Our Target: A Chat App&lt;/h2&gt;
&lt;p&gt;Let’s keep it humble. No machine learning, no blockchain, no AI buzzwords glued onto CRUD. Just a &lt;strong&gt;straightforward web-based chat application&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Here’s what it does:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;User Registration:&lt;/strong&gt; Email + password&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Login System:&lt;/strong&gt; Username/password auth, session cookies&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;User Directory:&lt;/strong&gt; Displays online users&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;1:1 Messaging:&lt;/strong&gt; Users can send and receive messages&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Message History:&lt;/strong&gt; Stored and retrievable&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Admin Panel:&lt;/strong&gt; Hidden route, unknown to regular users&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Now, this setup probably feels familiar. It’s the backbone of a thousand hackathons and product MVPs. But here’s the truth: &lt;strong&gt;simple apps are hacker candy&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Why? Because developers often make the same assumptions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;“It’s just a prototype.”&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;“Who would even try to attack this?”&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;“We’ll add security later.”&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Later never comes&lt;/strong&gt;. And these &quot;low-risk&quot; features? They can become pivot points for privilege escalation, data leaks, or full compromise. One misconfigured route or weak endpoint can become your next Incident Report ticket.&lt;/p&gt;
&lt;p&gt;So before we start breaking things (in Part 2), let’s apply &lt;strong&gt;STRIDE threat modeling&lt;/strong&gt;—a time-tested threat modeling framework from Microsoft—to map out &lt;strong&gt;what could go wrong&lt;/strong&gt; across this app’s lifecycle.&lt;/p&gt;
&lt;p&gt;Next stop: breaking down each of the six STRIDE categories and how they apply to this seemingly innocent app.&lt;/p&gt;
&lt;h2&gt;STRIDE: A Bit of History, Tools, and Fun Facts&lt;/h2&gt;
&lt;p&gt;Before we tear our chat app apart threat by threat, it’s worth pausing to talk about where STRIDE came from—and why it’s still standing strong in today’s security architecture playbook.&lt;/p&gt;
&lt;h3&gt;Where Did STRIDE Threat Modeling Come From?&lt;/h3&gt;
&lt;p&gt;STRIDE was developed by &lt;strong&gt;Microsoft&lt;/strong&gt; in the early 2000s, as part of their &lt;strong&gt;Trustworthy Computing Initiative&lt;/strong&gt;—yes, that era when Windows XP was everyone’s favorite backdoor. 😅&lt;/p&gt;
&lt;p&gt;The goal? Give developers and architects a lightweight, repeatable way to ask:&lt;br /&gt;
&lt;strong&gt;“What could go wrong here?”&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Rather than just tossing in a firewall and calling it a day, STRIDE forced teams to &lt;strong&gt;think in terms of threat categories&lt;/strong&gt;—not just patches and alerts. It came bundled into the &lt;strong&gt;Microsoft SDL (Security Development Lifecycle)&lt;/strong&gt; and has been a part of secure-by-design processes ever since.&lt;/p&gt;
&lt;p&gt;And you know what? It still holds up, especially in a world dominated by microservices, APIs, cloud, and third-party integrations.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;Tools That Support STRIDE Threat Modeling&lt;/h3&gt;
&lt;p&gt;You don’t have to scribble on whiteboards or use napkins (though, respect if you do). Here are a few tools to actually &lt;strong&gt;implement STRIDE modeling&lt;/strong&gt; in your workflows:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Good For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OWASP Threat Dragon&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Open-source threat modeling tool with STRIDE templates&lt;/td&gt;
&lt;td&gt;Visual modeling, diagrams&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Microsoft Threat Modeling Tool&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Free tool from Microsoft for STRIDE-based modeling&lt;/td&gt;
&lt;td&gt;Deep STRIDE threat modeling templates, flow diagrams&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IriusRisk&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Paid tool for automated threat modeling and compliance mapping&lt;/td&gt;
&lt;td&gt;Enterprise threat modeling at scale&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Draw.io + STRIDE Cards&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;DIY visual modeling using STRIDE cards&lt;/td&gt;
&lt;td&gt;Lightweight teams, whiteboard replacements&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;Pro Tip: If you already use architecture diagrams in tools like Lucidchart or Miro, &lt;strong&gt;just layer STRIDE annotations on top&lt;/strong&gt;. It’s easier than reinventing the wheel with a new platform.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;STRIDE Breakdown – Mapping Threats to Chat App Features&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;STRIDE&lt;/strong&gt; stands for &lt;strong&gt;Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, and Elevation of Privilege&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;./STRIDE.jpg&quot; alt=&quot;STRIDE threat modeling matrix&quot; title=&quot;STRIDE threat modeling matrix&quot; /&gt;&lt;/p&gt;
&lt;p&gt;It&apos;s the OG threat modeling framework for secure-by-design thinking.&lt;/p&gt;
&lt;p&gt;For each category, we’ll look at:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;What it means&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Real-world relevance&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;How it applies to our chat app&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;How to detect it&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;How to prevent or mitigate&lt;/strong&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Let’s dive in.&lt;/p&gt;
&lt;hr /&gt;
&lt;h3&gt;S – Spoofing Identity&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;What It Means&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Spoofing is about pretending to be someone you’re not. Usually, that’s about faking &lt;strong&gt;identity&lt;/strong&gt;—think unauthorized login attempts, session impersonation, or token theft.&lt;/p&gt;
&lt;p&gt;It doesn’t have to be high-tech. A weak password policy or default admin credentials are all it takes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Real-World Relevance&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Credential stuffing from leaked password dumps&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Session hijacking via stolen cookies&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Social engineering leading to unauthorized access&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Chat App Example&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Attacker tries logging in as &lt;code&gt;admin&lt;/code&gt; with common passwords like &lt;code&gt;admin123&lt;/code&gt; or &lt;code&gt;password&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Registration page reveals whether a username/email is already taken (“User already exists”) → helps confirm valid accounts.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Session cookie doesn’t use &lt;code&gt;HttpOnly&lt;/code&gt; or &lt;code&gt;Secure&lt;/code&gt; flags → attacker injects JS and steals session.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;How to Detect&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Unusual login attempts across many usernames&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Brute-force behavior from single IPs&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Session reuse across different IPs/devices&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;How to Prevent&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Enforce strong passwords and rate-limiting&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Use MFA (seriously, just do it)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Harden sessions: &lt;code&gt;HttpOnly&lt;/code&gt;, &lt;code&gt;Secure&lt;/code&gt;, &lt;code&gt;SameSite=Strict&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Generic error messages (“Login failed”) to prevent enumeration&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Alert on login anomalies (e.g., geolocation or timing mismatches)&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
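&lt;p&gt;The session-hardening item above, sketched with the Python standard library (the cookie name and token are illustrative; in a web framework you would pass the same flags to its set-cookie helper):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from http.cookies import SimpleCookie

def session_cookie_header(token):
    # HttpOnly: not readable by JS; Secure: HTTPS only;
    # SameSite=Strict: never sent on cross-site requests.
    cookie = SimpleCookie()
    cookie[&quot;session&quot;] = token
    cookie[&quot;session&quot;][&quot;httponly&quot;] = True
    cookie[&quot;session&quot;][&quot;secure&quot;] = True
    cookie[&quot;session&quot;][&quot;samesite&quot;] = &quot;Strict&quot;
    return cookie[&quot;session&quot;].OutputString()

print(session_cookie_header(&quot;abc123&quot;))
&lt;/code&gt;&lt;/pre&gt;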
&lt;hr /&gt;
&lt;h3&gt;T – Tampering with Data&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;What It Means&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Tampering is about &lt;strong&gt;unauthorized modification of data&lt;/strong&gt;—altering messages, modifying user roles, or injecting parameters to mess with system behavior.&lt;/p&gt;
&lt;p&gt;This could be at-rest (modifying DB records), in-transit (MITM), or even through insecure APIs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Real-World Relevance&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Changing prices on e-commerce sites&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Modifying permissions via API injection&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Overwriting user data via insecure endpoints&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Chat App Example&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;User crafts a &lt;code&gt;PUT /api/messages/1234&lt;/code&gt; call to edit &lt;strong&gt;someone else’s&lt;/strong&gt; message&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Sends chat message with embedded &lt;code&gt;&amp;lt;script&amp;gt;&lt;/code&gt; tag to execute JS on recipient’s browser&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Manually edits session data in localStorage to escalate role from &lt;code&gt;user&lt;/code&gt; to &lt;code&gt;admin&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;How to Detect&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Unexpected mutations in data logs&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Sudden role changes or message edits by unauthorized users&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Parameter tampering attempts in logs (via WAF or API gateway)&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;How to Prevent&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Use digital signatures or hash checks for message integrity (e.g., HMAC)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Implement strict authorization checks at the &lt;strong&gt;server&lt;/strong&gt;, not just UI&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Sanitize inputs (yes, again—this never gets old)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Disable client-side trust for roles or permissions&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
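&lt;p&gt;As a sketch of the HMAC idea: the server tags each message with a keyed hash, and any modified body fails verification. The secret below is a placeholder; a real system loads it from a secret store.&lt;/p&gt;

```python
import hashlib
import hmac

SECRET = b"server-side-secret"  # placeholder key, never hardcode in practice

def sign_message(body):
    """Return a hex HMAC tag for the message body."""
    return hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()

def verify_message(body, tag):
    """Constant-time check that the body was not tampered with."""
    return hmac.compare_digest(sign_message(body), tag)

tag = sign_message("hello from alice")
assert verify_message("hello from alice", tag)
assert not verify_message("hello from mallory", tag)  # tampered body fails
```

&lt;p&gt;Note the use of compare_digest rather than ==, which avoids leaking information through timing differences.&lt;/p&gt;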
&lt;hr /&gt;
&lt;h3&gt;R – Repudiation&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;What It Means&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Repudiation is when an attacker performs actions &lt;strong&gt;without accountability&lt;/strong&gt;—then denies them. If the system doesn’t log properly, they get away clean.&lt;/p&gt;
&lt;p&gt;It’s like someone deleting all your Slack messages and saying, “Wasn’t me.”&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Real-World Relevance&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Lack of logs in cloud misconfigurations&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Insider threats covering their tracks&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Attackers disabling or deleting logs post-compromise&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Chat App Example&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;User deletes messages with no audit trail—no record of who said what&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Admin bans a user but there’s no timestamp or log of that action&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A rogue employee reads DMs and no one knows because access wasn&apos;t logged&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;How to Detect&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;You can’t… unless you already had good logging in place&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Look for missing data in activity logs&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Use external systems (like SIEM) to detect deletions or gaps&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;How to Prevent&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Immutable logging (e.g., append-only logs with checksum verification)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Log all sensitive actions: logins, deletions, permission changes&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Store logs off-host (e.g., centralized logging with ELK or Loki)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Use timestamping + user context in every log event&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
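&lt;p&gt;One way to approximate append-only logging without special infrastructure is a hash chain, where each entry commits to the previous one, so deleting or editing any entry breaks verification. A minimal sketch (the AuditLog class is hypothetical, not a real library):&lt;/p&gt;

```python
import hashlib
import json

class AuditLog:
    """Append-only log: each entry hashes the previous entry's hash,
    so tampering anywhere in the chain is detectable."""
    def __init__(self):
        self.entries = []
        self.prev_hash = "0" * 64

    def append(self, actor, action):
        entry = {"actor": actor, "action": action, "prev": self.prev_hash}
        digest = hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()
        entry["hash"] = digest
        self.prev_hash = digest
        self.entries.append(entry)

    def verify(self):
        prev = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
            if entry["prev"] != prev or digest != entry["hash"]:
                return False
            prev = digest
        return True

log = AuditLog()
log.append("alice", "deleted message 42")
log.append("admin", "banned user bob")
assert log.verify()
log.entries[0]["action"] = "nothing happened"  # tampering breaks the chain
assert not log.verify()
```

&lt;p&gt;In production you would still ship these entries off-host, but the chain gives you cheap tamper evidence on top.&lt;/p&gt;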
&lt;hr /&gt;
&lt;h3&gt;I – Information Disclosure&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;What It Means&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This is about &lt;strong&gt;leaking data&lt;/strong&gt; to unauthorized users. Could be PII, secrets, internal APIs, or even error messages that give away the goods.&lt;/p&gt;
&lt;p&gt;It doesn’t need to be a SQL injection. Sometimes it’s just poorly scoped permissions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Real-World Relevance&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Exposed S3 buckets&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Leaky APIs showing internal user info&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Stack traces returned in production&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Chat App Example&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;User accesses &lt;code&gt;/api/messages?id=4001&lt;/code&gt; and gets &lt;strong&gt;another user’s message&lt;/strong&gt; because there&apos;s no ownership check&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;API returns full user records including email and IPs&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Error page reveals server paths or tech stack via verbose stack trace&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;How to Detect&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Data leak detection in outbound logs&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;DLP (Data Loss Prevention) tools for sensitive data patterns&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Review of access control on all endpoints and APIs&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;How to Prevent&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Apply object-level access controls (don’t trust “just the route”)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Strip metadata from responses&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Mask sensitive data (e.g., show part of an email, not all)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Disable detailed errors in production&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
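&lt;p&gt;Masking sensitive data can be as simple as a helper like this sketch; the exact policy (how many characters to reveal) is up to you:&lt;/p&gt;

```python
def mask_email(email):
    """Show just enough of an email to be recognizable, hide the rest."""
    local, _, domain = email.partition("@")
    visible = local[:1]
    hidden = "*" * max(len(local) - 1, 1)
    return visible + hidden + "@" + domain

print(mask_email("alice@example.com"))  # prints a****@example.com
```
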
&lt;hr /&gt;
&lt;h3&gt;D – Denial of Service (DoS)&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;What It Means&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;DoS means making a system &lt;strong&gt;unavailable or unusable&lt;/strong&gt;—intentionally or accidentally—usually by overwhelming it.&lt;/p&gt;
&lt;p&gt;This isn’t just about traffic floods. It includes logic bombs, resource exhaustion, and malformed input that crashes the app.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Real-World Relevance&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Spamming forms or chat endpoints&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Flooding chat with emojis or large payloads&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Abuse of nested JSON to crash parsers&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Chat App Example&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Bot sends thousands of messages per second → server CPU maxes out&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Large message payloads (10MB+ text blobs) crash DB or front-end&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Abuse of emoji reactions to spam notifications&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;How to Detect&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Rate spikes on endpoints (monitor RPS/latency)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Alerts for memory, CPU, or queue overflows&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;App crashes tied to malformed input&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;How to Prevent&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Rate limiting per IP/token/user (e.g., using Redis buckets)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Set max body size on requests&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Queue-based processing (isolate spikes from core logic)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;CAPTCHAs on forms and registration&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
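&lt;p&gt;The rate-limiting idea maps naturally to a token bucket. This in-memory sketch stands in for the Redis-backed version mentioned above; the names are illustrative:&lt;/p&gt;

```python
import time

class TokenBucket:
    """Per-client bucket: holds up to `capacity` tokens, refilled at
    `rate` tokens per second; each request consumes one token."""
    def __init__(self, capacity, rate):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if int(self.tokens) == 0:
            return False  # no whole token available: reject the request
        self.tokens -= 1.0
        return True

bucket = TokenBucket(capacity=3, rate=1.0)
results = [bucket.allow() for _ in range(5)]
print(results)
```

&lt;p&gt;In a real deployment, keep the bucket state in shared storage (e.g., Redis) so every app instance enforces the same limit per IP, token, or user.&lt;/p&gt;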
&lt;hr /&gt;
&lt;h3&gt;E – Elevation of Privilege&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;What It Means&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This one’s the crown jewel of attacks. EoP is when a normal user gains &lt;strong&gt;higher privileges&lt;/strong&gt;—like becoming an admin, impersonating other users, or accessing restricted areas.&lt;/p&gt;
&lt;p&gt;This often comes from &lt;strong&gt;missing authorization checks&lt;/strong&gt; or client-side trust.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Real-World Relevance&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;IDOR (Insecure Direct Object Reference)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Hidden admin features discovered by poking URLs&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;JWT manipulation (changing &lt;code&gt;role: user&lt;/code&gt; → &lt;code&gt;role: admin&lt;/code&gt;)&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Chat App Example&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Regular user discovers &lt;code&gt;/admin/users&lt;/code&gt; route and sees admin dashboard&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;API lets any authenticated user call &lt;code&gt;DELETE /users/{id}&lt;/code&gt; without role check&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;JWT token is unsigned or uses symmetric secret → attacker creates valid “admin” token&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;How to Detect&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Auth bypass attempts in logs&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Use of admin-only routes by regular users&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Role mismatches between session claims and observed behavior&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;How to Prevent&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Enforce role-based access on the backend (never rely on frontend auth)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Use signed JWTs with asymmetric signing algorithms (prefer RS256 over HS256)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Scope tokens tightly (expiration, audience, permissions)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Always fail securely—deny access by default&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
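&lt;p&gt;A deny-by-default role check on the backend can be as small as a decorator. This sketch is illustrative only; the ROLES store and Forbidden error are hypothetical stand-ins for your session layer:&lt;/p&gt;

```python
import functools

ROLES = {"alice": "admin", "bob": "user"}  # stand-in for a server-side session store

class Forbidden(Exception):
    pass

def require_role(*allowed):
    """Server-side check: deny by default unless the caller's role is allowed."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(caller, *args, **kwargs):
            role = ROLES.get(caller)  # unknown users get role None and are denied
            if role not in allowed:
                raise Forbidden(f"{caller!r} lacks any of {allowed}")
            return fn(caller, *args, **kwargs)
        return wrapper
    return decorator

@require_role("admin")
def delete_user(caller, target_id):
    return f"deleted user {target_id}"

print(delete_user("alice", 42))  # prints deleted user 42
```

&lt;p&gt;The key property is that an unknown or unlisted role falls through to denial; there is no branch that grants access by accident.&lt;/p&gt;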
&lt;hr /&gt;
&lt;h2&gt;Interesting STRIDE Facts (Because Nerding Out Is Fun)&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Mnemonic Origins:&lt;/strong&gt; STRIDE threat modeling is a &lt;strong&gt;backronym&lt;/strong&gt;—it was created to map common threat types to the core properties of secure systems (authentication, integrity, non-repudiation, confidentiality, availability, and authorization).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&quot;D&quot; Is Sneaky:&lt;/strong&gt; Denial of Service in STRIDE isn’t always massive traffic floods. It includes logic bombs and resource starvation too. Your app doesn’t have to go down in flames to be considered under DoS threat.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;STRIDE Is Not a Checklist:&lt;/strong&gt; It’s a &lt;strong&gt;thinking framework&lt;/strong&gt;, not a compliance sheet. The real power is in using it to uncover flaws in the design &lt;em&gt;before&lt;/em&gt; they hit production.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;STRIDE + DFD = 🔥&lt;/strong&gt;: It’s most effective when paired with &lt;strong&gt;data flow diagrams&lt;/strong&gt; (DFDs). You model how data flows through your app, then apply STRIDE to each element (data store, process, external entity, etc.).&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;STRIDE’s Secret Superpower&lt;/h2&gt;
&lt;p&gt;Most threat modeling frameworks require heavy lifting or lots of training. STRIDE hits that sweet spot: &lt;strong&gt;easy enough for a dev to use, powerful enough for a security architect to trust&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The beauty? You can use it on &lt;strong&gt;anything&lt;/strong&gt;—from a serverless app to a Kubernetes cluster to, yep, our friendly chat app.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If you’re building features faster than you’re threat modeling, you’re building features that might become attack surfaces. STRIDE slows you down just enough to build wisely.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Final Thoughts on STRIDE Threat Modeling&lt;/h2&gt;
&lt;p&gt;STRIDE threat modeling isn’t just academic. It’s a conversation starter. A design reviewer. A build-time bodyguard.&lt;/p&gt;
&lt;p&gt;Every time you launch a new feature or review a pull request, ask:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;S&lt;/strong&gt; – Could someone fake an identity here?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;T&lt;/strong&gt; – Can they change something they shouldn’t?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;R&lt;/strong&gt; – Will we know who did what?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;I&lt;/strong&gt; – Are we leaking anything useful?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;D&lt;/strong&gt; – Can someone take this down with a hammer?&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;E&lt;/strong&gt; – What happens if a normal user pushes the limits?&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Apply this mindset to each part of your system—&lt;strong&gt;auth&lt;/strong&gt;, &lt;strong&gt;storage&lt;/strong&gt;, &lt;strong&gt;API&lt;/strong&gt;, &lt;strong&gt;UI&lt;/strong&gt;, &lt;strong&gt;admin tools&lt;/strong&gt;, and even &lt;strong&gt;logs&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Your future self (and your customers) will thank you.&lt;/p&gt;
&lt;p&gt;More references:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href=&quot;https://owasp.org/www-community/Threat_Modeling_Process&quot;&gt;OWASP Threat Modeling Process&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;:::info
Like the post? You&apos;re welcome to check out my other posts:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;/posts/a-deep-dive-into-http-basic-authentication&quot;&gt;A Deep Dive into HTTP Basic Authentication&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;/posts/password-authentication-in-node-js-a-step-by-step-guide&quot;&gt;Password Authentication in Node.js: A Step-by-Step Guide&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;/posts/unlocking-web-security-master-jwt-authentication&quot;&gt;Unlocking Web Security: Master JWT Authentication&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;/tags/cybersec&quot;&gt;CyberSecurity&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;:::&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Agentic Frameworks: A Quick Guide to the 2025 Agent War</title><link>https://geekcoding101.com/posts/a-quick-guide</link><guid isPermaLink="true">https://geekcoding101.com/posts/a-quick-guide</guid><pubDate>Fri, 14 Nov 2025 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Introduction to the Agentic Framework Series&lt;/h1&gt;
&lt;p&gt;Lately it feels like the world of AI moves so fast that if you blink, you miss an entire generation of breakthroughs. As someone who &lt;em&gt;loves&lt;/em&gt; digging into emerging technologies, I finally gathered the courage to steal a little time from a busy schedule and kick off a series I’ve been wanting to write for ages. And what better place to start than the booming world of &lt;strong&gt;agentic frameworks&lt;/strong&gt;? With LangGraph, LlamaIndex Agents, OpenAI’s Agents SDK, Google’s ADK, and Microsoft’s Agent Framework all evolving at lightning speed, the &lt;strong&gt;agentic AI&lt;/strong&gt; ecosystem is turning into a full-on 2025 “Agent War.” Since I’m constantly tracking these updates anyway, I figured—why not share the journey and explore this rapidly shifting landscape together?&lt;/p&gt;
&lt;h2&gt;&lt;strong&gt;What Is an Agentic Framework?&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;An &lt;strong&gt;agentic framework&lt;/strong&gt; is a software toolkit designed to help developers build AI agents—systems that can reason, take actions, use tools, and complete multi-step tasks with a level of autonomy. Instead of treating an LLM as a “single prompt in, single answer out” model, an agentic framework gives it structure: memory, tools, workflows, decision loops, and the ability to interact with data or external systems. Frameworks like LangGraph, LlamaIndex Agents, OpenAI’s Agent SDK, and Google’s ADK make it possible to create agents that can research, retrieve information, write code, operate APIs, and even coordinate with other agents. In short, an agentic framework transforms a passive model into an active problem-solver.&lt;/p&gt;
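&lt;p&gt;To make the decision-loop idea concrete, here is a toy sketch of the reason-act cycle. It is not any framework&apos;s API; the decide() function stands in for the LLM call that real frameworks orchestrate, and the calculator tool is hypothetical:&lt;/p&gt;

```python
def calculator(expression):
    # Demo tool only; never eval untrusted input in real code.
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def decide(task, observation):
    """Stand-in for the LLM: pick a tool, or finish once we have a result."""
    if observation is None:
        return ("call", "calculator", task)
    return ("finish", "The answer is " + observation)

def run_agent(task):
    observation = None
    for _ in range(5):  # decision loop with a step budget
        step = decide(task, observation)
        if step[0] == "finish":
            return step[1]
        _, tool_name, tool_input = step
        observation = TOOLS[tool_name](tool_input)

print(run_agent("6 * 7"))  # prints The answer is 42
```

&lt;p&gt;Everything the frameworks below add (memory, handoffs, guardrails, persistence) is layered around a loop shaped roughly like this one.&lt;/p&gt;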
&lt;p&gt;BTW, &lt;strong&gt;agentic frameworks&lt;/strong&gt; are often called &lt;em&gt;AI agent frameworks&lt;/em&gt; as well.&lt;/p&gt;
&lt;p&gt;Enough talk — let’s get to it!&lt;/p&gt;
&lt;h1&gt;Let&apos;s Summon Popular Agentic Frameworks&lt;/h1&gt;
&lt;h2&gt;LangChain + LangGraph&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Positioning:&lt;/strong&gt;&lt;br /&gt;
Probably still the most widely recognized OSS agent stack; LangChain for the agent loop and tools, LangGraph for durable, stateful workflows. Link: &lt;a href=&quot;https://blog.langchain.com/langchain-langgraph-1dot0/?utm_source=chatgpt.com&quot;&gt;LangChain Blog&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Latest status (late 2025):&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;LangChain and LangGraph both hit v1.0 with a tightened “core agent loop” and a middleware system for flexible control.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;LangGraph adds graph-structured workflows, persistence, debugging, and visual tools for complex agent interactions and long-running processes.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt;&lt;br /&gt;
General-purpose agent apps: RAG copilots, workflow agents, multi-step tool use, where you want a large ecosystem and lots of examples.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;LlamaIndex (LlamaAgents &amp;amp; Workflows)&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Positioning:&lt;/strong&gt;&lt;br /&gt;
Started as a data/RAG library, now a full “data-centric agent framework” with strong connectors, parsing (LlamaParse), and high-quality RAG pipelines. Link: &lt;a href=&quot;https://www.llamaindex.ai/workflows?utm_source=chatgpt.com&quot;&gt;LlamaIndex Workflows&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Latest status:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;LlamaAgents early-access: full-stack templates for building agents, including TypeScript workflows and CopilotKit integrations.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;2025 comparisons show improved retrieval quality and strong performance for document-heavy workloads.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt;&lt;br /&gt;
Knowledge-intensive agents: enterprise search copilots, contract analysis, tech docs assistants, any system where the “data plane” is the hard part.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;OpenAI Agents SDK (successor to Swarm)&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Positioning:&lt;/strong&gt;&lt;br /&gt;
OpenAI’s first-party agentic framework for building agents over their Responses API: a minimal set of primitives (agents, handoffs, guardrails, sessions) with tight GPT integration. Link: &lt;a href=&quot;https://openai.github.io/openai-agents-python/?utm_source=chatgpt.com&quot;&gt;OpenAI Agents SDK&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Latest status:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Designed as the production-ready successor to the earlier “Swarm” multi-agent experiment.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Integrated with OpenAI’s new Responses API (web search, computer use, document search), replacing the older Assistants API over 2025-26.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt;&lt;br /&gt;
Teams already standardized on OpenAI: quick path to agents with web search, tools, and multi-agent delegation without heavy orchestration code.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Microsoft Agent Framework (AutoGen + Semantic Kernel)&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Positioning:&lt;/strong&gt;&lt;br /&gt;
Microsoft is merging AutoGen’s multi-agent orchestration with Semantic Kernel’s enterprise integration into a single “Microsoft Agent Framework” for Python and .NET. Link: &lt;a href=&quot;https://github.com/microsoft/agent-framework?utm_source=chatgpt.com&quot;&gt;Microsoft Agent Framework&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Latest status:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;AutoGen is now in maintenance mode; new feature development is happening in Microsoft Agent Framework.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Provides agents, planners, and orchestration with hooks into Azure/OpenAI, Office 365, and other Microsoft services.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt;&lt;br /&gt;
Enterprise .NET/Python shops on Azure that want multi-agent workflows tied into existing Microsoft infra, identity, and DevOps.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Google Agent Development Kit (ADK) / Vertex AI Agent Builder&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Positioning:&lt;/strong&gt;&lt;br /&gt;
Google’s open-source Agent Development Kit plus Vertex AI Agent Builder: ADK for local/dev usage, Agent Builder for managed, scalable deployment. Link: &lt;a href=&quot;https://cloud.google.com/products/agent-builder?utm_source=chatgpt.com&quot;&gt;Google Agent Builder&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Latest status:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;ADK announced in 2025 as a modular, model-agnostic agentic framework (though optimized for Gemini and Google Cloud).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Recent updates add prebuilt plugins (including “self-heal”), more language support (Go, Python, Java), and richer observability &amp;amp; security for production agents.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt;&lt;br /&gt;
GCP-centric teams: data &amp;amp; MLOps agents (BigQuery, Dataflow, etc.), multi-system enterprise agents, and workloads that need Vertex AI governance features.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;CrewAI&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Positioning:&lt;/strong&gt;&lt;br /&gt;
A popular OSS “multi-agent crew” agentic framework: define specialized agents (roles), share context, and let them collaborate on tasks. Link: &lt;a href=&quot;https://www.crewai.com/?utm_source=chatgpt.com&quot;&gt;CrewAI&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;:::info
Sneak peek — the next episode will likely be a hands-on dive into CrewAI!
:::&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Latest status:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Active development and strong marketing as a “multi-agent platform” with business-oriented workflows.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Frequently cited (alongside LangChain &amp;amp; AutoGen) as a top choice in 2025 industry roundups because of its clean Python API and real-world focus.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt;&lt;br /&gt;
Multi-agent experiments and startup-style stacks: e.g., “researcher + planner + executor” teams for content, lead-gen, or coding tasks.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Haystack&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Positioning:&lt;/strong&gt;&lt;br /&gt;
RAG + agentic framework aimed squarely at production: modular pipelines and “agents” that can call tools, retrieve data, and generate answers. Link: &lt;a href=&quot;https://docs.haystack.deepset.ai/docs/intro?utm_source=chatgpt.com&quot;&gt;Haystack&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Latest status:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Marketed specifically for “agentic, compound AI systems” with end-to-end observability and debugging.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Supports agents that choose between web search, vector stores, and other tools to resolve complex queries.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt;&lt;br /&gt;
Teams that want a transparent, production-grade RAG/agent stack with strong search roots (Elastic/OpenSearch, vector DBs) and clear pipelines.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;OpenHands Software Agent SDK (software-dev agents)&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Positioning:&lt;/strong&gt;&lt;br /&gt;
A toolkit spun out of the popular OpenHands code-assistant framework, specifically for reliable software development agents (coding, debugging, PRs). Link: &lt;a href=&quot;https://arxiv.org/abs/2511.03690?utm_source=chatgpt.com&quot;&gt;OpenHands Software Agent SDK&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Latest status:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;2025 paper describes a redesigned SDK for flexible, secure software agents: sandboxed execution, multi-LLM routing, and integration with editors (VS Code, browser, CLI).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt;&lt;br /&gt;
Engineering teams wanting &lt;em&gt;production&lt;/em&gt; code agents (e.g., SWE-Bench style tasks) with strong execution sandboxes and lifecycle controls.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Research / Training-oriented frameworks (Agent Lightning, etc.)&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Positioning:&lt;/strong&gt;&lt;br /&gt;
These are less “app frameworks” and more &lt;em&gt;training&lt;/em&gt; stacks, but relevant for anyone planning RL-fine-tuned agents. The paper link is &lt;a href=&quot;https://arxiv.org/abs/2511.03690?utm_source=chatgpt.com&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Agent Lightning&lt;/strong&gt;: RL training framework that decouples training from agent execution, plugging into existing agent stacks like LangChain, AutoGen, OpenAI Agents SDK with minimal changes.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt;&lt;br /&gt;
R&amp;amp;D teams working on agent RL, evaluation, and fine-tuning rather than pure orchestration.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;Dominant Use Cases&lt;/h2&gt;
&lt;p&gt;Across vendors and OSS, you see a few consistent patterns:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;RAG copilots / knowledge assistants&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Internal “Chat with docs” for policies, support docs, codebases (LangChain, LlamaIndex, Haystack).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Industry: customer support, legal analysis, technical documentation.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Agentic Process Automation (APA) / workflow agents&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Multi-step agents that call APIs, write files, trigger workflows, etc. (LangGraph, CrewAI, OpenAI Agents SDK, MS Agent Framework, Google ADK). Link: &lt;a href=&quot;https://www.ampcome.com/post/top-7-ai-agent-frameworks-in-2025&quot;&gt;Top 7 AI Agentic Frameworks in 2025: The Ultimate Guide.&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Example: end-to-end lead processing, back-office ops, report generation.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Software engineering agents&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Coding, refactoring, test-generation, PR review (OpenHands SDK; OpenAI computer-use agents; some AutoGen/CrewAI patterns).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Data &amp;amp; analytics agents&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Data engineering/data science agents in the cloud (&lt;a href=&quot;https://www.androidcentral.com/apps-software/ai/google-cloud-is-adding-six-new-ai-agents-for-devs-scientists-and-power-users?utm_source=chatgpt.com&quot;&gt;Google’s Data Engineering Agent, Data Science Agent; ADK/Vertex AI&lt;/a&gt;).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;SQL-query agents, interactive dashboards, ETL automation.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Ops / infra &amp;amp; enterprise workflows&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Cloud management, monitoring, and remediation (&lt;a href=&quot;https://www.androidcentral.com/apps-software/ai/google-cloud-is-adding-six-new-ai-agents-for-devs-scientists-and-power-users?utm_source=chatgpt.com&quot;&gt;GCP’s new ops agents, Azure/AKS workflows, security &amp;amp; governance hooks&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;hr /&gt;
&lt;h1&gt;What Is the Best AI Agent Framework?!&lt;/h1&gt;
&lt;p&gt;Haha! I know you will ask this question!&lt;/p&gt;
&lt;p&gt;Choosing the best AI agent framework isn’t as simple as crowning a single winner—because the “best” depends entirely on what you’re building. &lt;strong&gt;LangGraph&lt;/strong&gt; excels at complex, stateful workflows with fine-grained control. &lt;strong&gt;LlamaIndex&lt;/strong&gt; dominates when your agent needs powerful data retrieval and document intelligence. &lt;strong&gt;OpenAI’s Agents SDK&lt;/strong&gt; is unbeatable for rapid development with built-in web search, tool use, and multi-agent orchestration. &lt;strong&gt;Google’s ADK&lt;/strong&gt; shines in enterprise environments that lean heavily on GCP data pipelines. And &lt;strong&gt;Microsoft’s Agent Framework&lt;/strong&gt; integrates seamlessly with Azure and the broader Microsoft ecosystem. Instead of looking for a universal champion, the smarter question is: &lt;em&gt;Which agentic framework aligns with your stack, your data, and your use case?&lt;/em&gt; Because in 2025’s Agent War, &quot;context&quot;—not hype—decides the winner.&lt;/p&gt;
&lt;h1&gt;Where Can We Go Deeper Next?&lt;/h1&gt;
&lt;p&gt;We’ve only scratched the surface of what the agentic ecosystem is becoming. From production-ready orchestration with LangGraph, to data-centric workflows in LlamaIndex, to OpenAI’s emerging Agent SDK and Google’s ADK reshaping enterprise automation—the real excitement starts when we dig into how these agentic frameworks actually behave in the wild.&lt;/p&gt;
&lt;p&gt;In the next episodes, I&apos;d like to break down architectures, explore real-world use cases, and walk through hands-on builds you can follow step by step. If the 2025 “Agent War” (or “Agentic Framework War”, whatever we end up calling it) has sparked your curiosity, stick around—this is just the beginning, and the most interesting battles are still ahead.&lt;/p&gt;
&lt;p&gt;:::info
You&apos;re on a roll! Don&apos;t stop now—check out the other series and level up your AI skills. Make sure you haven&apos;t missed anything! 👇&lt;/p&gt;
&lt;p&gt;🚀 &lt;a href=&quot;/tags/daily-ai-insights&quot;&gt;Daily AI Insights&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;🚀 &lt;a href=&quot;/tags/machine-learning&quot;&gt;Machine Learning&lt;/a&gt;
:::&lt;/p&gt;
</content:encoded><author>GeekCoding101</author></item><item><title>Understanding OpenDAL Storage in Dify: A New Year&apos;s Journey</title><link>https://geekcoding101.com/posts/dify-01-opendal-storage-configuration</link><guid isPermaLink="true">https://geekcoding101.com/posts/dify-01-opendal-storage-configuration</guid><pubDate>Thu, 01 Jan 2026 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Understanding OpenDAL Storage in Dify: A New Year&apos;s Journey&lt;/h1&gt;
&lt;h2&gt;My Story&lt;/h2&gt;
&lt;p&gt;Happy New Year! 🎉&lt;/p&gt;
&lt;p&gt;As 2026 kicks off, I continue to dive deeper into the AI ecosystem and experiment with various tools. My focus? Building practical AI workflows with platforms like &lt;a href=&quot;https://dify.ai&quot;&gt;Dify&lt;/a&gt;, &lt;a href=&quot;https://n8n.io&quot;&gt;n8n&lt;/a&gt;, &lt;a href=&quot;https://openrouter.ai&quot;&gt;OpenRouter&lt;/a&gt; and so on. This post documents one of my first deep dives with Dify - figuring out how its file storage actually works.&lt;/p&gt;
&lt;p&gt;If you&apos;re like me and got confused about where &lt;a href=&quot;https://dify.ai&quot;&gt;Dify&lt;/a&gt; stores files, why containers keep restarting with cryptic permission errors, or what the heck &lt;code&gt;OPENDAL_FS_ROOT&lt;/code&gt; actually does, this guide is for you.&lt;/p&gt;
&lt;h2&gt;What is This About?&lt;/h2&gt;
&lt;p&gt;I run self-hosted &lt;a href=&quot;https://dify.ai&quot;&gt;Dify&lt;/a&gt; with Docker. I found that its file storage configuration is not well documented, so I decided to write this post to help others understand how it works.&lt;/p&gt;
&lt;p&gt;When you&apos;re running &lt;a href=&quot;https://dify.ai&quot;&gt;Dify&lt;/a&gt; with Docker, understanding how file storage works is crucial. Files uploaded to &lt;a href=&quot;https://dify.ai&quot;&gt;Dify&lt;/a&gt; (documents, images, etc.) need to be stored somewhere, and that &quot;somewhere&quot; involves a dance between environment variables, Docker volume mounts, and a library called &lt;a href=&quot;https://github.com/apache/opendal&quot;&gt;OpenDAL&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Let me break down what I learned the hard way.&lt;/p&gt;
&lt;h2&gt;The Key Players&lt;/h2&gt;
&lt;h3&gt;1. OpenDAL - The Unsung Hero&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;https://github.com/apache/opendal&quot;&gt;OpenDAL&lt;/a&gt;&lt;/strong&gt; (Apache Open Data Access Layer) is basically a Swiss Army knife for storage. It gives you one consistent API to talk to different storage backends - local filesystem, AWS S3, Azure Blob, you name it. Think of it as a translator that speaks &quot;storage&quot; in many dialects.&lt;/p&gt;
&lt;h3&gt;2. Environment Variables&lt;/h3&gt;
&lt;p&gt;Your &lt;code&gt;.env&lt;/code&gt; file is where the magic configuration happens. This is where you tell Dify how and where to store files.&lt;/p&gt;
&lt;h3&gt;3. Docker Volume Mounts&lt;/h3&gt;
&lt;p&gt;This is the bridge between your Mac (or whatever host you&apos;re on) and the Docker container&apos;s internal filesystem. Get this wrong, and you&apos;ll be scratching your head for hours. Trust me, I know.&lt;/p&gt;
&lt;h2&gt;How The Pieces Fit Together&lt;/h2&gt;
&lt;h3&gt;Step 1: The Environment Variable Magic&lt;/h3&gt;
&lt;p&gt;When you set &lt;code&gt;OPENDAL_SCHEME=fs&lt;/code&gt;, the system starts looking for variables that match this pattern:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;OPENDAL_&amp;lt;SCHEME_NAME&amp;gt;_&amp;lt;CONFIG_NAME&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So for filesystem storage:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;OPENDAL_SCHEME=fs&lt;/code&gt; → tells it to use local filesystem&lt;/li&gt;
&lt;li&gt;&lt;code&gt;OPENDAL_FS_ROOT=&amp;lt;path&amp;gt;&lt;/code&gt; → tells it where to put files&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Simple enough, right? Well, here&apos;s where it gets interesting...&lt;/p&gt;
&lt;h3&gt;Step 2: What Happens Inside the Container&lt;/h3&gt;
&lt;p&gt;I dove into the source code (&lt;code&gt;api/extensions/storage/opendal_storage.py&lt;/code&gt;) and found this gem:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def _get_opendal_kwargs(*, scheme: str, env_file_path: str = &quot;.env&quot;, prefix: str = &quot;OPENDAL_&quot;):
    kwargs = {}
    config_prefix = prefix + scheme.upper() + &quot;_&quot;  # Creates &quot;OPENDAL_FS_&quot;
    
    # Scans environment variables
    for key, value in os.environ.items():
        if key.startswith(config_prefix):
            kwargs[key[len(config_prefix):].lower()] = value
    # OPENDAL_FS_ROOT becomes kwargs[&apos;root&apos;]
    return kwargs
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then in the OpenDALStorage constructor:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def __init__(self, scheme: str, **kwargs):
    kwargs = kwargs or _get_opendal_kwargs(scheme=scheme)
    
    if scheme == &quot;fs&quot;:
        root = kwargs.get(&quot;root&quot;, &quot;storage&quot;)  # Gets OPENDAL_FS_ROOT value
        Path(root).mkdir(parents=True, exist_ok=True)  # Creates directory inside container
&lt;/code&gt;&lt;/pre&gt;
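&lt;p&gt;Putting those two snippets together, the whole mechanism fits in a few lines. Here&apos;s a minimal, runnable sketch (my own simplification, not the exact Dify code) showing how &lt;code&gt;OPENDAL_FS_ROOT&lt;/code&gt; ends up as the &lt;code&gt;root&lt;/code&gt; kwarg:&lt;/p&gt;

```python
import os

def get_opendal_kwargs(scheme, prefix="OPENDAL_"):
    # Simplified re-creation of Dify's _get_opendal_kwargs:
    # collect every OPENDAL_{SCHEME}_* variable and strip the prefix
    config_prefix = prefix + scheme.upper() + "_"   # e.g. "OPENDAL_FS_"
    return {
        key[len(config_prefix):].lower(): value
        for key, value in os.environ.items()
        if key.startswith(config_prefix)
    }

# Simulate the container environment from the default .env
os.environ["OPENDAL_SCHEME"] = "fs"
os.environ["OPENDAL_FS_ROOT"] = "/app/api/storage"

kwargs = get_opendal_kwargs(scheme=os.environ["OPENDAL_SCHEME"])
print(kwargs)  # {'root': '/app/api/storage'}
```

&lt;p&gt;Note that &lt;code&gt;OPENDAL_SCHEME&lt;/code&gt; itself doesn&apos;t match the &lt;code&gt;OPENDAL_FS_&lt;/code&gt; prefix, so only &lt;code&gt;root&lt;/code&gt; lands in the kwargs.&lt;/p&gt;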
&lt;h3&gt;Step 3: The Critical Connection - Volume Mounts&lt;/h3&gt;
&lt;p&gt;Here&apos;s what tripped me up initially. In &lt;code&gt;docker-compose.yaml&lt;/code&gt;, you&apos;ll see:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;volumes:
  - ./volumes/app/storage:/app/api/storage
    # ^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^
    # Your Mac              Inside container
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;The &quot;Aha!&quot; Moment:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Left side&lt;/strong&gt; (&lt;code&gt;./volumes/app/storage&lt;/code&gt;): This is a folder on &lt;strong&gt;your actual Mac&lt;/strong&gt; (relative to where docker-compose.yaml lives)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Right side&lt;/strong&gt; (&lt;code&gt;/app/api/storage&lt;/code&gt;): This is a folder &lt;strong&gt;inside the Docker container&lt;/strong&gt; (a completely separate filesystem)&lt;/li&gt;
&lt;li&gt;Docker magically keeps these two folders in sync&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So when Dify writes a file to &lt;code&gt;/app/api/storage&lt;/code&gt; inside the container, it appears in &lt;code&gt;./volumes/app/storage&lt;/code&gt; on your Mac. Mind = blown. 🤯&lt;/p&gt;
&lt;h2&gt;Real-World Examples&lt;/h2&gt;
&lt;h3&gt;The Default Setup (What Works Out of the Box)&lt;/h3&gt;
&lt;p&gt;This is what Dify gives you by default, and honestly, it works great:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# .env
STORAGE_TYPE=opendal
OPENDAL_SCHEME=fs
OPENDAL_FS_ROOT=/app/api/storage  # Container path - must match volume mount
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;⚠️ Deprecated Configuration (Do Not Use)&lt;/h3&gt;
&lt;p&gt;The following configuration is &lt;strong&gt;deprecated&lt;/strong&gt; and should be migrated to OpenDAL:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# DEPRECATED - Old approach
STORAGE_TYPE=local
STORAGE_LOCAL_PATH=storage
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Why it&apos;s deprecated:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;STORAGE_TYPE=local&lt;/code&gt; is marked as deprecated in the codebase&lt;/li&gt;
&lt;li&gt;&lt;code&gt;STORAGE_LOCAL_PATH&lt;/code&gt; is deprecated in favor of OpenDAL&apos;s configuration&lt;/li&gt;
&lt;li&gt;OpenDAL provides a unified interface that supports multiple storage backends&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Migration path:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Before (Deprecated)
STORAGE_TYPE=local
STORAGE_LOCAL_PATH=storage

# After (Current)
STORAGE_TYPE=opendal
OPENDAL_SCHEME=fs
OPENDAL_FS_ROOT=/app/api/storage
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The OpenDAL approach offers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Unified configuration pattern across all storage types&lt;/li&gt;
&lt;li&gt;Better extensibility (easy to switch to S3, Azure Blob, etc.)&lt;/li&gt;
&lt;li&gt;Improved error handling and retry mechanisms&lt;/li&gt;
&lt;li&gt;Active maintenance and support&lt;/li&gt;
&lt;/ul&gt;
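&lt;p&gt;To see what that unified pattern buys you, here&apos;s a sketch (my own illustration, with made-up S3 variable names that I haven&apos;t verified against Dify&apos;s S3 docs) applying the same prefix-stripping rule to a second backend:&lt;/p&gt;

```python
import os

def opendal_kwargs(scheme):
    # One rule for every backend: strip "OPENDAL_{SCHEME}_" and lowercase
    prefix = "OPENDAL_" + scheme.upper() + "_"
    return {k[len(prefix):].lower(): v
            for k, v in os.environ.items() if k.startswith(prefix)}

# Local filesystem (the default setup)
os.environ["OPENDAL_FS_ROOT"] = "/app/api/storage"
# Hypothetical S3 setup: only the variable names change, not the mechanism
os.environ["OPENDAL_S3_BUCKET"] = "my-dify-bucket"
os.environ["OPENDAL_S3_REGION"] = "us-east-1"

print(opendal_kwargs("fs"))  # {'root': '/app/api/storage'}
print(opendal_kwargs("s3"))  # {'bucket': 'my-dify-bucket', 'region': 'us-east-1'}
```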
&lt;p&gt;Back to the default setup: the matching volume mount in &lt;code&gt;docker-compose.yaml&lt;/code&gt; is&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# docker-compose.yaml
volumes:
  - ./volumes/app/storage:/app/api/storage
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; Files stored in &lt;code&gt;./volumes/app/storage/&lt;/code&gt; on your Mac.&lt;/p&gt;
&lt;h3&gt;What If I Want Files Somewhere Else?&lt;/h3&gt;
&lt;p&gt;Maybe you&apos;re like me and want all your AI project files in a specific folder. Here&apos;s how:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# .env (NO CHANGE NEEDED)
STORAGE_TYPE=opendal
OPENDAL_SCHEME=fs
OPENDAL_FS_ROOT=/app/api/storage  # Keep this as container path
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;# docker-compose.yaml (ONLY change the left side)
volumes:
  - ~/Documents/models/dify_data/files:/app/api/storage
    # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^
    # Custom host path                    Same container path
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; All your Dify files now live in &lt;code&gt;~/Documents/models/dify_data/files/&lt;/code&gt; on your Mac. Perfect for keeping your AI experiments organized!&lt;/p&gt;
&lt;h2&gt;Mistakes I Made (So You Don&apos;t Have To)&lt;/h2&gt;
&lt;h3&gt;🤦 Mistake #1: The $HOME Trap&lt;/h3&gt;
&lt;p&gt;I thought &quot;Hey, I&apos;ll just use &lt;code&gt;$HOME&lt;/code&gt; to point to my Documents folder!&quot;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# DON&apos;T DO THIS - I learned the hard way
OPENDAL_FS_ROOT=$HOME/Documents/models/dify_data/files
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;What happened:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;My containers started crash-looping&lt;/li&gt;
&lt;li&gt;Error logs screamed: &lt;code&gt;PermissionError: [Errno 13] Permission denied: &apos;/Users&apos;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Spent 2 hours debugging 😅&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Why it failed:&lt;/strong&gt;
Inside the Docker container, &lt;code&gt;$HOME&lt;/code&gt; is &lt;code&gt;/root&lt;/code&gt;, not &lt;code&gt;/Users/yourusername&lt;/code&gt;. The container tried to create &lt;code&gt;/Users/yourusername/Documents/...&lt;/code&gt; and failed spectacularly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Always use the container path
OPENDAL_FS_ROOT=/app/api/storage
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;🤦 Mistake #2: Trying to Be Too Clever&lt;/h3&gt;
&lt;p&gt;When I wanted a custom storage location, my first instinct was to change &lt;code&gt;OPENDAL_FS_ROOT&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# NOPE - This breaks everything
OPENDAL_FS_ROOT=/my/custom/path
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;The problem:&lt;/strong&gt; This path must match the &lt;strong&gt;right side&lt;/strong&gt; of your volume mount. If they don&apos;t match, chaos ensues.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The right way:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Keep &lt;code&gt;OPENDAL_FS_ROOT=/app/api/storage&lt;/code&gt; (container path)&lt;/li&gt;
&lt;li&gt;Only change the &lt;strong&gt;left side&lt;/strong&gt; of the volume mount (your Mac path)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;🤦 Mistake #3: Relative Path Confusion&lt;/h3&gt;
&lt;p&gt;Using relative paths seemed harmless:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Avoid this
OPENDAL_FS_ROOT=storage  # Where even is this?
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;The issue:&lt;/strong&gt; Inside a container, &quot;relative to what?&quot; becomes a real question. Is it relative to &lt;code&gt;/app&lt;/code&gt;? &lt;code&gt;/app/api&lt;/code&gt;? Who knows!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Better approach:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Crystal clear - no ambiguity
OPENDAL_FS_ROOT=/app/api/storage
&lt;/code&gt;&lt;/pre&gt;
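&lt;p&gt;A quick Python check makes the difference concrete: a relative root silently resolves against whatever the process&apos;s working directory happens to be, while an absolute root is unambiguous.&lt;/p&gt;

```python
from pathlib import Path

rel = Path("storage")                 # relative to... whatever the cwd is
print(rel.is_absolute())              # False
print(rel.resolve())                  # cwd + "/storage"; moves if the cwd moves

abs_root = Path("/app/api/storage")   # unambiguous, independent of the cwd
print(abs_root.is_absolute())         # True
```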
&lt;h2&gt;When Things Go Wrong (Debugging Tips)&lt;/h2&gt;
&lt;h3&gt;Symptom: Containers Keep Restarting&lt;/h3&gt;
&lt;p&gt;This was my first encounter with Dify. Everything would start, then crash, start again, crash again. Fun times.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;First, check the logs:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;docker-compose logs --tail=50 api
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;If you see this:&lt;/strong&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;PermissionError: [Errno 13] Permission denied: &apos;/Users&apos;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You&apos;ve likely hit the &lt;code&gt;$HOME&lt;/code&gt; trap or a path mismatch. Fix &lt;code&gt;OPENDAL_FS_ROOT&lt;/code&gt; to use the container path.&lt;/p&gt;
&lt;h3&gt;My Debugging Checklist&lt;/h3&gt;
&lt;p&gt;When something&apos;s off, I run through these commands:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Check environment variable inside container:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;docker exec docker-api-1 env | grep OPENDAL
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Check mounted directory:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;docker exec docker-api-1 ls -la /app/api/storage
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Verify sync with host:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ls -la ./volumes/app/storage/
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;/ol&gt;
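&lt;p&gt;If you prefer a single command, the checks above can be bundled into a tiny preflight script (my own convenience, not part of Dify), assuming Python is available inside the api container:&lt;/p&gt;

```python
import os
from pathlib import Path

# Read the storage root the same way Dify will
# (Dify's fallback default is "storage", as seen in the constructor above)
root = os.environ.get("OPENDAL_FS_ROOT", "storage")
path = Path(root)

print("OPENDAL_FS_ROOT :", root)
print("absolute path?  :", path.is_absolute())   # should be True
print("exists?         :", path.exists())        # should be True when the mount works
print("writable?       :", path.exists() and os.access(root, os.W_OK))
```

&lt;p&gt;Copy it into the container and run it with &lt;code&gt;docker exec&lt;/code&gt;; for the default setup, all three answers should come back &lt;code&gt;True&lt;/code&gt;.&lt;/p&gt;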
&lt;h2&gt;Quick Reference (The TL;DR)&lt;/h2&gt;
&lt;p&gt;Here&apos;s everything in one place:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What&lt;/th&gt;
&lt;th&gt;Why It Matters&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;STORAGE_TYPE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Which storage system to use&lt;/td&gt;
&lt;td&gt;&lt;code&gt;opendal&lt;/code&gt; ✅ (&lt;s&gt;&lt;code&gt;local&lt;/code&gt; is old news&lt;/s&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;OPENDAL_SCHEME&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;What kind of storage&lt;/td&gt;
&lt;td&gt;&lt;code&gt;fs&lt;/code&gt; for local files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;OPENDAL_FS_ROOT&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Where files go &lt;strong&gt;in the container&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/app/api/storage&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;s&gt;&lt;code&gt;STORAGE_LOCAL_PATH&lt;/code&gt;&lt;/s&gt;&lt;/td&gt;
&lt;td&gt;&lt;s&gt;Old way of doing things&lt;/s&gt;&lt;/td&gt;
&lt;td&gt;&lt;s&gt;Use OpenDAL instead&lt;/s&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Volume mount (left)&lt;/td&gt;
&lt;td&gt;Where files appear &lt;strong&gt;on your Mac&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;./volumes/app/storage&lt;/code&gt; or custom path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Volume mount (right)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Must match&lt;/strong&gt; &lt;code&gt;OPENDAL_FS_ROOT&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/app/api/storage&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;The Golden Rules:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;code&gt;OPENDAL_FS_ROOT&lt;/code&gt; always points to a container path (right side of volume mount)&lt;/li&gt;
&lt;li&gt;Want files elsewhere on your Mac? Change only the left side of the volume mount&lt;/li&gt;
&lt;li&gt;Never use &lt;code&gt;$HOME&lt;/code&gt; or Mac paths in &lt;code&gt;OPENDAL_FS_ROOT&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;When in doubt, use absolute paths&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Wrapping Up&lt;/h2&gt;
&lt;p&gt;This was just one piece of my AI experimentation journey. As I continue exploring Dify, n8n, and the broader AI ecosystem in 2026, I&apos;m sure I&apos;ll encounter more quirks and learning moments. That&apos;s the fun part, right?&lt;/p&gt;
&lt;p&gt;If you found this helpful or have your own Dify war stories, I&apos;d love to hear them! This AI revolution is moving fast, and we&apos;re all learning together.&lt;/p&gt;
&lt;p&gt;Happy building! 🚀&lt;/p&gt;
&lt;h2&gt;Useful Links&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/opendal&quot;&gt;Apache OpenDAL Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/apache/opendal/tree/main/core/src/services&quot;&gt;OpenDAL Service Configurations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Dify code: &lt;code&gt;api/extensions/storage/opendal_storage.py&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://geekcoding101.com/posts/a-quick-guide&quot;&gt;Agentic Frameworks: A Quick Guide to the 2025 Agent War&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</content:encoded><author>GeekCoding101</author></item></channel></rss>