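Studying the requirejs source code (01): the initialization flow

Article link: https://blog.csdn.net/lengye7/article/details/123718277

Preface

It is already 2022, and everyone uses webpack for all kinds of bundling. webpack is also compatible with the various module formats, but requirejs, once a very popular module loader, is still worth studying.

The goal of this study is not to understand every aspect of requirejs, but to understand how its module loading works.

Note: this article only covers requirejs in the browser environment.

Program entry point

The requirejs version analyzed in this article: 2.3.6.

I am a beginner, and reading the source straight through is clearly not realistic for me, so I trace its execution flow by debugging it dynamically with F12 (the browser developer tools).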

<!-- I do not use the data-main approach here; see the official site for the reasons -->
<script src="require.js" type="text/javascript"></script>
<script>
        require.config({
            baseUrl:"js",
            paths:{
                app:"./app",
            }
        });
        require(["app"],function(){
            console.log("This is index.html,require app success!");
        });
</script>

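Let's look at the first line of code first:

This is the signature of the immediately invoked function. It takes two parameters: one is global, the other is setTimeout (strange, why add a setTimeout parameter?). It is invoked with the following arguments:

The first argument, this, needs no discussion: in the global scope, this refers to the global object.

The second argument is an expression that checks whether setTimeout is defined.

So why declare setTimeout as a parameter of the function?

I do not know the reason. When I removed the setTimeout parameter, requirejs still ran normally in the browser. Since requirejs can also be used outside the browser, for example in node, my guess is that setTimeout may not have been available in some of those environments (perhaps early node), hence this parameter.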
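
To make the shape of this wrapper concrete, here is a minimal sketch, assuming the wrapper looks the way it is described above. `demoApi` and its members are hypothetical names used only for illustration and are not part of requirejs. The sketch also shows one observable effect of the parameter: the closure keeps its own reference to setTimeout, so replacing the global afterwards does not affect it. Whether that is the real motivation is something I cannot confirm.

```javascript
// Hypothetical stand-in for the library's public API (not a requirejs name).
var demoApi;

(function (global, setTimeout) {
    // Inside the closure, `setTimeout` is the parameter captured at load
    // time, not a fresh lookup of the global on every call.
    demoApi = {
        hasTimer: typeof setTimeout === 'function',
        usesCurrentGlobalTimer: function () {
            return setTimeout === globalThis.setTimeout;
        }
    };
}(this, (typeof setTimeout === 'undefined' ? undefined : setTimeout)));

// Replace the global timer afterwards (as a mocking library might do).
var originalTimer = globalThis.setTimeout;
globalThis.setTimeout = function fakeTimer() {};
var stillCurrent = demoApi.usesCurrentGlobalTimer(); // closure kept the original
globalThis.setTimeout = originalTimer;               // restore the real timer
```

The second argument uses typeof so that the expression does not throw a ReferenceError when setTimeout does not exist at all.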

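This first line also declares three global variables:

requirejs
require
define

These three variables are later initialized to the corresponding functions, which are the APIs we use.

Moving on

A whole set of variables is initialized in the closure. A few important ones:

contexts: all contexts are stored here (in my trace in the browser there was only one context).
cfg: configuration is passed along through it later; it holds the data-main related configuration.
globalDefQueue: the global define queue; think of it as the transit station for module definitions on their way to being loaded.
isBrowser: whether we are running in a browser environment.
defContextName: the name of the default context; in my browser trace this was the only context.

Moving on

This piece of code is presumably there to prevent requirejs from being loaded twice and to avoid conflicts with other implementations of the AMD specification.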
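
The idea behind such a guard can be sketched as follows. This is only my reading of it, not a copy of the source: `inspectExistingGlobals` and the shape of its result are hypothetical. The part where a pre-existing plain object named require or requirejs is picked up as configuration is an assumption based on the documented usage of defining require as a config object before require.js is loaded.

```javascript
// Hypothetical helper: decides what a loader should do given the globals
// that already exist when its script starts running.
function inspectExistingGlobals(existing) {
    var isFunction = function (value) {
        return typeof value === 'function';
    };

    // Some AMD loader already owns `define`: leave everything untouched.
    if (typeof existing.define !== 'undefined') {
        return { skip: true, cfg: {} };
    }
    // `requirejs` is already a function: the loader was loaded before.
    if (isFunction(existing.requirejs)) {
        return { skip: true, cfg: {} };
    }
    // A plain object named `requirejs` or `require` is treated as config.
    if (typeof existing.requirejs !== 'undefined') {
        return { skip: false, cfg: existing.requirejs };
    }
    if (typeof existing.require !== 'undefined' && !isFunction(existing.require)) {
        return { skip: false, cfg: existing.require };
    }
    return { skip: false, cfg: {} };
}

var decision = inspectExistingGlobals({ require: { baseUrl: 'js' } });
// decision.skip is false and decision.cfg.baseUrl is 'js'
```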

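The code that follows, from line 199 to line 1749, is entirely the definition of function newContext(). It is basically the core of requirejs, but it is not executed right here, so we skip it for now.

Execution then continues to:

Here req and requirejs are defined as the same anonymous function.

At this point the first API, requirejs, is defined.

Continuing:

This section defines various properties on req (which points to the same anonymous function as requirejs).

Line 1822 sets require = req, so the global variable require is initialized and the require API is ready.

Now req = requirejs = require.

At the same time require.config is also initialized; this API is very important.

Further down we reach req({}), the first function call encountered during initialization.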

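# Creating the default context: req({})

Let's step into this function and see what it actually does.

As the comment says, this initializes a default context and does nothing else.

On line 1790 the newly obtained default context is stored in contexts under the context name defined at the beginning, which is a single _ (underscore).

This is, however, the first time newContext() is used. It is the core of requirejs, so let's step into it.

newContext is too long to paste here in full.

The beginning of the function initializes some variables. These are the variables declared in the newContext closure. config is the most important one: it holds the default requirejs configuration. All of these variables end up being stored on the context object.

After that, a series of internal functions is defined, followed by the handlers and Module objects.

Next comes the place where context itself is defined. The definition is fairly long, so a brief look is enough.

The variables defined earlier are all stored on context. Some important properties:

config, Module, nextTick.

At the end of newContext, context.makeRequire() is called, which initializes context.require to one of its inner functions, localRequire. Let's see what localRequire actually does.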

//Code of makeRequire
            makeRequire: function (relMap, options) {
                options = options || {};

                function localRequire(deps, callback, errback) {
                    var id, map, requireMod;

                    if (options.enableBuildCallback && callback && isFunction(callback)) {
                        callback.__requireJsBuild = true;
                    }

                    if (typeof deps === 'string') {
                        if (isFunction(callback)) {
                            //Invalid call
                            return onError(makeError('requireargs', 'Invalid require call'), errback);
                        }

                        //If require|exports|module are requested, get the
                        //value for them from the special handlers. Caveat:
                        //this only works while module is being defined.
                        if (relMap && hasProp(handlers, deps)) {
                            return handlers[deps](registry[relMap.id]);
                        }

                        //Synchronous access to one module. If require.get is
                        //available (as in the Node adapter), prefer that.
                        if (req.get) {
                            return req.get(context, deps, relMap, localRequire);
                        }

                        //Normalize module name, if it contains . or ..
                        map = makeModuleMap(deps, relMap, false, true);
                        id = map.id;

                        if (!hasProp(defined, id)) {
                            return onError(makeError('notloaded', 'Module name "' +
                                        id +
                                        '" has not been loaded yet for context: ' +
                                        contextName +
                                        (relMap ? '' : '. Use require([])')));
                        }
                        return defined[id];
                    }

                    //Grab defines waiting in the global queue.
                    intakeDefines();

                    //Mark all the dependencies as needing to be loaded.
                    context.nextTick(function () {
                        //Some defines could have been added since the
                        //require call, collect them.
                        intakeDefines();

                        requireMod = getModule(makeModuleMap(null, relMap));

                        //Store if map config should be applied to this require
                        //call for dependencies.
                        requireMod.skipMap = options.skipMap;

                        requireMod.init(deps, callback, errback, {
                            enabled: true
                        });

                        checkLoaded();
                    });

                    return localRequire;
                }

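localRequire is in fact the entry point from which scripts get loaded. How exactly it does that is not analyzed here; it will be covered in detail later.

So we have context.require = localRequire. Every script load is ultimately started by calling it.

Summary: this step really only initializes the default context. It sets up the requirejs configuration, plus some data structures that hold module information and the methods that operate on them.

Continuing
The define API is defined on line 2061.

At this point all requirejs APIs have been initialized.

Continuing, req(cfg) is executed once more.

I am not loading via data-main here, so cfg is still an empty object. This call is therefore no different from the earlier req({}): the config of the current context does not change, and nothing else happens.

At this point requirejs initialization is complete.

Note: if data-main is used as the entry point, things are different. cfg then contains a baseUrl setting and also the script named by data-main as deps, so this req(cfg) call changes baseUrl in the current context's config to the directory of the data-main script. This matches what the requirejs documentation says. The call also loads the script named by data-main.

Why is req() executed twice?

1. The first call creates the default context and initializes the related data structures and helper functions.

2. The second call handles the data-main logic: it sets baseUrl to the data-main path and finally loads the data-main script through a context.require() call.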
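
The two calls can be modelled with a heavily simplified sketch. `miniReq`, `newMiniContext` and the `log` array are hypothetical; only the control flow follows the req() code discussed in this article. That configure() itself calls context.require when cfg.deps is present is my understanding of the source and should be read as an assumption.

```javascript
var contexts = {};
var defContextName = '_';
var log = [];

// Hypothetical, tiny replacement for newContext().
function newMiniContext(contextName) {
    var context = {
        contextName: contextName,
        config: { baseUrl: './' },
        configure: function (cfg) {
            if (cfg.baseUrl) {
                context.config.baseUrl = cfg.baseUrl;
            }
            // Assumption: deps passed as config are required from here.
            if (cfg.deps) {
                context.require(cfg.deps);
            }
        },
        require: function (deps) {
            log.push('require:' + deps.join(','));
        }
    };
    log.push('created:' + contextName);
    return context;
}

function miniReq(config) {
    var context = contexts[defContextName];
    if (!context) {
        context = contexts[defContextName] = newMiniContext(defContextName);
    }
    context.configure(config);
    // A config-only call still ends in a require with an empty deps list.
    return context.require([]);
}

miniReq({});                                 // first call: creates the context
miniReq({ baseUrl: 'js/', deps: ['main'] }); // second call: data-main style cfg
```

After both calls the log reads created:_, require:, require:main, require: in that order, which shows that the context is created only once and that each config call still triggers an empty require.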

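Brief summary of the requirejs initialization flow

1. Lines 12-36 initialize the data structures used by the closure

Some important data structures:

contexts: all contexts are stored here (in my trace in the browser there was only one context).
cfg: configuration is passed along through it later; it holds the data-main related configuration.
globalDefQueue: the global define queue; think of it as the transit station for module definitions on their way to being loaded.
isBrowser: whether we are running in a browser environment.
defContextName: the name of the default context; in my browser trace this was the only context.

2. Next, a series of helper functions and the important newContext function are defined, and some APIs are initialized

Lines 1764-1833 all initialize req, where req = requirejs = require.

1) Lines 1764-1798 initialize req and requirejs; lines 1821-1823 initialize require.

2) Lines 1804-1806 initialize req.config, that is, the require.config API, which is the configuration API requirejs exposes to the outside.

3) Lines 1814-1816 initialize req.nextTick, an important function used later to schedule tasks; internally it uses setTimeout.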
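
A nextTick of this kind can be sketched as below. The 4 ms delay and the synchronous fallback are assumptions from my reading of the source; treat the exact values as illustrative.

```javascript
// Defer `fn` with setTimeout when it exists, otherwise run it immediately.
var nextTick = typeof setTimeout !== 'undefined' ?
    function (fn) { setTimeout(fn, 4); } :
    function (fn) { fn(); };

var order = [];
nextTick(function () {
    order.push('task');
});
order.push('sync'); // runs first: the task is deferred to a later turn
```

This deferral is why the callback of require([...]) never runs synchronously inside the require call itself.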

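3. Line 1836 executes req({}) to create the default context

The important points: the default context is created and stored in contexts, and every later API access to the default context goes through contexts, which is one of the closure data structures initialized at the start. In addition, an important function is created: context.require = context.makeRequire() = localRequire. This function is the core entry point for loading modules later.

4. Then comes another round of initialization and function definitions

Lines 1875-1991 define two extremely important functions, createNode and load (both methods of req). createNode creates the HTML element, and load is where the script tag is actually created to load a js file. load calls createNode to create the script tag and registers handlers for the onreadystatechange and load events; these handlers are what keep module execution in order.

5. Then the data-main logic is handled

Lines 2007-2052 handle data-main: they obtain the data-main path and the corresponding js script name and store them in cfg (cfg is one of the closure data structures initialized at the start).

6. Right after that comes the definition of define

Lines 2061-2126 define the define function. At this point all APIs are initialized: define, require, require.config.

7. Finally req(cfg) is executed

If data-main is set, this call behaves differently: cfg contains baseUrl (the path derived from data-main) and deps (the js script named by data-main). The call modifies the context configuration, changing baseUrl, and calls context.require to load the script modules in cfg.deps.

If data-main is not set, this call changes nothing.

A complaint here:

When req is called with a config object, it always makes an internal context.require call with meaningless arguments, followed by a meaningless nextTick call. Couldn't a condition check be added before the call? Really...

I went around in circles here several times at first, unable to figure out which nextTick task actually loads data-main.

Analysis of some important functions

1. A series of helper functions

Lines 55-79 contain two functions, each and eachReverse. They iterate over an array (each from left to right, eachReverse from right to left) and call the callback for each element; if the callback returns a truthy value, iteration stops. This differs from the forEach method, which cannot be stopped early.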
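
The early-exit behaviour can be sketched like this. It is a simplified version written from the description above, not a copy of the source; as far as I can tell the real helpers additionally skip falsy array entries, which this sketch leaves out.

```javascript
function each(ary, func) {
    if (ary) {
        for (var i = 0; i < ary.length; i += 1) {
            // A truthy return value from the callback stops the loop.
            if (func(ary[i], i, ary)) {
                break;
            }
        }
    }
}

function eachReverse(ary, func) {
    if (ary) {
        for (var i = ary.length - 1; i > -1; i -= 1) {
            if (func(ary[i], i, ary)) {
                break;
            }
        }
    }
}

var visited = [];
each([1, 2, 3, 4], function (value) {
    visited.push(value);
    return value === 2; // stop once 2 has been seen
});

var visitedReverse = [];
eachReverse([1, 2, 3, 4], function (value) {
    visitedReverse.push(value);
    return value === 3; // stop once 3 has been seen
});
```

Here visited ends up as [1, 2] and visitedReverse as [4, 3].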

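Lines 81-87 contain hasProp and getOwn, two other important helpers. hasProp checks whether an obj has a given property of its own, and getOwn returns the value of such a property.

eachProp on lines 94-103 is similar to each and eachReverse, except that it iterates over an object's own enumerable properties.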
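
The three property helpers from the two paragraphs above can be sketched as follows. This is a minimal sketch written from their described behaviour; the `registry` object and its contents are made up for the example.

```javascript
var hasOwn = Object.prototype.hasOwnProperty;

function hasProp(obj, prop) {
    return hasOwn.call(obj, prop);
}

function getOwn(obj, prop) {
    // Returns the value only for own properties; false otherwise.
    return hasProp(obj, prop) && obj[prop];
}

function eachProp(obj, func) {
    for (var prop in obj) {
        if (hasProp(obj, prop)) {
            // As with each, a truthy return value stops the iteration.
            if (func(obj[prop], prop)) {
                break;
            }
        }
    }
}

var registry = { app: 'js/app.js', util: 'js/util.js' };
var seen = [];
eachProp(registry, function (value, prop) {
    seen.push(prop);
    return prop === 'app'; // truthy: stop after the first property
});
```

Going through hasOwnProperty means that inherited names such as toString are never mistaken for module ids.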

mixin on lines 109-128 merges the properties of source into target. The merge rule is that a property is copied when source has it and target does not.
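
A minimal sketch of this merge rule follows. The `force` flag is an assumption on my part to show how the rule could be overridden; the deep merge handling of the real helper is left out entirely.

```javascript
function mixin(target, source, force) {
    if (source) {
        Object.keys(source).forEach(function (prop) {
            // Copy only when target lacks the property, unless forced.
            if (force || !Object.prototype.hasOwnProperty.call(target, prop)) {
                target[prop] = source[prop];
            }
        });
    }
    return target;
}

var merged = mixin({ a: 1 }, { a: 2, b: 3 });       // keeps a, adds b
var forced = mixin({ a: 1 }, { a: 2, b: 3 }, true); // overwrites a as well
```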

bind on lines 132-136 binds obj and func together, similar to Function.prototype.bind.
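
Such a helper can be sketched in a few lines. The `counter` object is made up for the example.

```javascript
function bind(obj, fn) {
    return function () {
        // Forward all arguments, but always use `obj` as `this`.
        return fn.apply(obj, arguments);
    };
}

var counter = {
    count: 0,
    increment: function (step) {
        this.count += step;
        return this.count;
    }
};

var boundIncrement = bind(counter, counter.increment);
var result = boundIncrement(2); // `this` is counter even though it is called bare
```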

scripts on lines 138-140 gets all script tags in the HTML document and returns them as an array-like collection.

2. The req function (require or requirejs)

    req = requirejs = function (deps, callback, errback, optional) {

        //Find the right context, use default
        var context, config,
            contextName = defContextName;

        // Determine if have config object in the call.
        if (!isArray(deps) && typeof deps !== 'string') {
            // deps is a config object
            config = deps;
            if (isArray(callback)) {
                // Adjust args if there are dependencies
                deps = callback;
                callback = errback;
                errback = optional;
            } else {
                deps = [];
            }
        }

        if (config && config.context) {
            contextName = config.context;
        }

        context = getOwn(contexts, contextName);
        if (!context) {
            context = contexts[contextName] = req.s.newContext(contextName);
        }

        if (config) {
            context.configure(config);
        }

        return context.require(deps, callback, errback);
    };

    /**
     * Support require.config() to make it easier to cooperate with other
     * AMD loaders on globally agreed names.
     */
    req.config = function (config) {
        return req(config);
    };

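This function is the API exposed by requirejs, corresponding to require.

It really does just two things: load dependencies through req, or modify the context configuration through req.

Dependencies are loaded through context.require(), the loading entry point.

Configuration is applied through context.configure(), the configuration entry point.

req.config corresponds to the require.config API; internally it simply calls req() to apply the configuration.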

    req.config = function (config) {
        return req(config);
    };
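
The argument shuffling at the top of req() is easier to follow in isolation. The sketch below pulls it out into a hypothetical helper, `normalizeReqArgs`, which does not exist in requirejs; the branching itself follows the req() code shown above.

```javascript
// Hypothetical helper isolating the argument handling at the top of req().
function normalizeReqArgs(deps, callback, errback, optional) {
    var config;
    if (!Array.isArray(deps) && typeof deps !== 'string') {
        // First argument is a config object.
        config = deps;
        if (Array.isArray(callback)) {
            // Shift the remaining arguments one position to the left.
            deps = callback;
            callback = errback;
            errback = optional;
        } else {
            deps = [];
        }
    }
    return { config: config, deps: deps, callback: callback, errback: errback };
}

var onLoad = function () {};
var plain = normalizeReqArgs(['app'], onLoad);                        // require(['app'], fn)
var configOnly = normalizeReqArgs({ baseUrl: 'js' });                 // require.config({...})
var configAndDeps = normalizeReqArgs({ baseUrl: 'js' }, ['app'], onLoad);
```

The configOnly case is the one behind the complaint earlier in this article: deps becomes an empty array, and context.require is still called with it.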

3. The req.load function, the function that finally loads a module.

    req.createNode = function (config, moduleName, url) {
        var node = config.xhtml ?
                document.createElementNS('http://www.w3.org/1999/xhtml', 'html:script') :
                document.createElement('script');
        node.type = config.scriptType || 'text/javascript';
        node.charset = 'utf-8';
        node.async = true;
        return node;
    };


    req.load = function (context, moduleName, url) {
        var config = (context && context.config) || {},
            node;
        if (isBrowser) {
            //In the browser so use a script tag
            node = req.createNode(config, moduleName, url);

            node.setAttribute('data-requirecontext', context.contextName);
            node.setAttribute('data-requiremodule', moduleName);

            //Set up load listener. Test attachEvent first because IE9 has
            //a subtle issue in its addEventListener and script onload firings
            //that do not match the behavior of all other browsers with
            //addEventListener support, which fire the onload event for a
            //script right after the script execution. See:
            //https://connect.microsoft.com/IE/feedback/details/648057/script-onload-event-is-not-fired-immediately-after-script-execution
            //UNFORTUNATELY Opera implements attachEvent but does not follow the script
            //script execution mode.
            if (node.attachEvent &&
                    //Check if node.attachEvent is artificially added by custom script or
                    //natively supported by browser
                    //read https://github.com/requirejs/requirejs/issues/187
                    //if we can NOT find [native code] then it must NOT natively supported.
                    //in IE8, node.attachEvent does not have toString()
                    //Note the test for "[native code" with no closing brace, see:
                    //https://github.com/requirejs/requirejs/issues/273
                    !(node.attachEvent.toString && node.attachEvent.toString().indexOf('[native code') < 0) &&
                    !isOpera) {
                //Probably IE. IE (at least 6-8) do not fire
                //script onload right after executing the script, so
                //we cannot tie the anonymous define call to a name.
                //However, IE reports the script as being in 'interactive'
                //readyState at the time of the define call.
                useInteractive = true;

                node.attachEvent('onreadystatechange', context.onScriptLoad);
                //It would be great to add an error handler here to catch
                //404s in IE9+. However, onreadystatechange will fire before
                //the error handler, so that does not help. If addEventListener
                //is used, then IE will fire error before load, but we cannot
                //use that pathway given the connect.microsoft.com issue
                //mentioned above about not doing the 'script execute,
                //then fire the script load event listener before execute
                //next script' that other browsers do.
                //Best hope: IE10 fixes the issues,
                //and then destroys all installs of IE 6-9.
                //node.attachEvent('onerror', context.onScriptError);
            } else {
                node.addEventListener('load', context.onScriptLoad, false);
                node.addEventListener('error', context.onScriptError, false);
            }
            node.src = url;

            //Calling onNodeCreated after all properties on the node have been
            //set, but before it is placed in the DOM.
            if (config.onNodeCreated) {
                config.onNodeCreated(node, config, moduleName, url);
            }

            //For some cache cases in IE 6-8, the script executes before the end
            //of the appendChild execution, so to tie an anonymous define
            //call to the module name (which is stored on the node), hold on
            //to a reference to this node, but clear after the DOM insertion.
            currentlyAddingScript = node;
            if (baseElement) {
                head.insertBefore(node, baseElement);
            } else {
                head.appendChild(node);
            }
            currentlyAddingScript = null;

            return node;
        } else if (isWebWorker) {
           。。。。。。
        }
    };

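require fetches js through script tags. Careful readers will notice that the async attribute is set on the node (the script tag loads asynchronously) and that load and other events are bound on the tag. The handler for the load event is onScriptLoad in context, shown below, which ultimately calls the context's completeLoad function to do the processing.

Once the file has finished loading, the main job is to run completeLoad. Note, however, that after the script has been fetched, the content of the script itself is executed first, and only after that does the load event fire and completeLoad run.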

onScriptLoad: function (evt) {
      //Using currentTarget instead of target for Firefox 2.0's sake. Not
      //all old browsers will be supported, but this one was easy enough
     //to support and still makes sense.
     if (evt.type === 'load' ||
        (readyRegExp.test((evt.currentTarget || evt.srcElement).readyState))) {
        //Reset interactive script so a script node is not held onto for
       //to long.
        interactiveScript = null;

     //Pull out the name of the module and the context.
        var data = getScriptData(evt);
        context.completeLoad(data.id);
        }
},

4. The define function

This function is the core API exposed by requirejs; modules are defined through it.

define = function (name, deps, callback) {
  var node,
  context;
  //do for multiple constructor
  ......
  //If no name, and callback is a function, then figure out if it a
  //CommonJS thing with dependencies.
  if (!deps && isFunction(callback)) {
    deps = [];
    //Remove comments from the callback string,
    //look for require calls, and pull them into the dependencies,
    //but only if there are function args.
    if (callback.length) {
      callback
      .toString()
      .replace(commentRegExp, '')
      .replace(cjsRequireRegExp, function (match, dep) {
        deps.push(dep);
      });
      deps = (callback.length === 1 ? ['require'] : ['require', 'exports', 'module']).concat(deps);
    }
  }
  //If in IE 6-8 and hit an anonymous define() call, do the interactive work.
  if (useInteractive) {
    node = currentlyAddingScript || getInteractiveScript();
    if (node) {
      if (!name) {
        name = node.getAttribute('data-requiremodule');
      }
      context = contexts[node.getAttribute('data-requirecontext')];
    }
  }
  //add to queue line
  if (context) {
    context.defQueue.push([name, deps, callback]);
    context.defQueueMap[name] = true;
  } else {
    globalDefQueue.push([name, deps, callback]);
  }
};

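That is the define function. There is not much code, but there is one novel thing in it: the code runs regular expressions against the text of callback.toString(). What is going on here? Let's look at the regular expressions used in the two replace calls.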

commentRegExp = /(\/\*([\s\S]*?)\*\/|([^:]|^)\/\/(.*)$)/mg;
cjsRequireRegExp = /[^.]\s*require\s*\(\s*["']([^'"\s]+)["']\s*\)/g;

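The first regex strips the comments from callback. The second regex matches require(.....) in the text of callback.toString() and pushes the ..... part into deps. Isn't that a wild trick? This is how the dependencies are obtained, a clever move. (I learned something here too: I never understood what toString was good for, and now I do. Once an object is converted to a string, it can be processed as text.)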
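
To see the scan in action, here is a small sketch that uses the two regular expressions exactly as quoted above. `scanDependencies` is a hypothetical wrapper around the scanning step of define(), and the module ids 'a', 'b', 'c/d' and 'e' are made up.

```javascript
// Regular expressions exactly as quoted in this article.
var commentRegExp = /(\/\*([\s\S]*?)\*\/|([^:]|^)\/\/(.*)$)/mg;
var cjsRequireRegExp = /[^.]\s*require\s*\(\s*["']([^'"\s]+)["']\s*\)/g;

// Hypothetical helper wrapping the scanning step of define().
function scanDependencies(callback) {
    var deps = [];
    if (callback.length) {
        callback
            .toString()
            .replace(commentRegExp, '')
            .replace(cjsRequireRegExp, function (match, dep) {
                deps.push(dep);
            });
        deps = (callback.length === 1 ?
            ['require'] : ['require', 'exports', 'module']).concat(deps);
    }
    return deps;
}

// The factory is never called here; only its source text is inspected.
var found = scanDependencies(function (require) {
    var a = require('a');
    /* var skipped = require('b'); */
    var c = require("c/d");
    // var alsoSkipped = require('e');
    return [a, c];
});
```

The require calls inside comments are ignored because the comments are stripped before the second regex runs, so found contains 'require', 'a' and 'c/d'. A factory that declares no parameters is not scanned at all.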

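At the end of define, its arguments are pushed as an array into globalDefQueue, the data structure initialized at the start (unless a context has already been identified, in which case they go straight into context.defQueue). As later analysis shows, this structure is accessed by a context, which moves its contents into the context and then loads them. So globalDefQueue is really just a data transit hub.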
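
The hand-off can be modelled with a heavily simplified sketch. Everything here is hypothetical except the names globalDefQueue, defQueue, takeGlobalQueue and completeLoad, which appear in the code quoted in this article; their bodies are my own reduction and only support the define(deps, callback) and define(name, deps, callback) forms.

```javascript
var globalDefQueue = [];

function define(name, deps, callback) {
    // Allow the anonymous form define(deps, callback).
    if (typeof name !== 'string') {
        callback = deps;
        deps = name;
        name = null;
    }
    globalDefQueue.push([name, deps, callback]);
}

function createContext() {
    var context = {
        defQueue: [],
        named: {},
        // Move everything waiting in the global queue into this context.
        takeGlobalQueue: function () {
            while (globalDefQueue.length) {
                context.defQueue.push(globalDefQueue.shift());
            }
        },
        // Called once the script for `moduleName` has executed.
        completeLoad: function (moduleName) {
            context.takeGlobalQueue();
            while (context.defQueue.length) {
                var args = context.defQueue.shift();
                if (args[0] === null) {
                    args[0] = moduleName; // anonymous define gets the script's name
                }
                context.named[args[0]] = { deps: args[1], factory: args[2] };
            }
        }
    };
    return context;
}

var context = createContext();
define(['util'], function () { return 'app module'; }); // what js/app.js would run
context.completeLoad('app');                             // what the load handler triggers
```

This is why an anonymous define() works: the script runs first and queues its definition without a name, and the load handler that fires right afterwards knows which module the script belonged to.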

5. context.require

This is the core function for module loading. It is initialized through makeRequire(), and context.require actually points to localRequire.

            makeRequire: function (relMap, options) {
                options = options || {};

                function localRequire(deps, callback, errback) {
                    var id, map, requireMod;

                    if (options.enableBuildCallback && callback && isFunction(callback)) {
                        callback.__requireJsBuild = true;
                    }

                    if (typeof deps === 'string') {
                        if (isFunction(callback)) {
                            //Invalid call
                            return onError(makeError('requireargs', 'Invalid require call'), errback);
                        }

                        //If require|exports|module are requested, get the
                        //value for them from the special handlers. Caveat:
                        //this only works while module is being defined.
                        if (relMap && hasProp(handlers, deps)) {
                            return handlers[deps](registry[relMap.id]);
                        }

                        //Synchronous access to one module. If require.get is
                        //available (as in the Node adapter), prefer that.
                        if (req.get) {
                            return req.get(context, deps, relMap, localRequire);
                        }

                        //Normalize module name, if it contains . or ..
                        map = makeModuleMap(deps, relMap, false, true);
                        id = map.id;

                        if (!hasProp(defined, id)) {
                            return onError(makeError('notloaded', 'Module name "' +
                                        id +
                                        '" has not been loaded yet for context: ' +
                                        contextName +
                                        (relMap ? '' : '. Use require([])')));
                        }
                        return defined[id];
                    }

                    //Grab defines waiting in the global queue.
                    intakeDefines();

                    //Mark all the dependencies as needing to be loaded.
                    context.nextTick(function () {
                        //Some defines could have been added since the
                        //require call, collect them.
                        intakeDefines();

                        requireMod = getModule(makeModuleMap(null, relMap));

                        //Store if map config should be applied to this require
                        //call for dependencies.
                        requireMod.skipMap = options.skipMap;

                        requireMod.init(deps, callback, errback, {
                            enabled: true
                        });

                        checkLoaded();
                    });

                    return localRequire;
                }

                mixin(localRequire, {
                    isBrowser: isBrowser,

                    /**
                     * Converts a module name + .extension into an URL path.
                     * *Requires* the use of a module name. It does not support using
                     * plain URLs like nameToUrl.
                     */
                    toUrl: function (moduleNamePlusExt) {
                                 。。。。。。
                    },

                    defined: function (id) {
                        return hasProp(defined, makeModuleMap(id, relMap, false, true).id);
                    },

                    specified: function (id) {
                        id = makeModuleMap(id, relMap, false, true).id;
                        return hasProp(defined, id) || hasProp(registry, id);
                    }
                });

                //Only allow undef on top level require calls
                if (!relMap) {
                    localRequire.undef = function (id) {
                        //Bind any waiting define() calls to this context,
                        //fix for #408
                        takeGlobalQueue();

                        var map = makeModuleMap(id, relMap, true),
                            mod = getOwn(registry, id);

                        mod.undefed = true;
                        removeScript(id);

                        delete defined[id];
                        delete urlFetched[map.url];
                        delete undefEvents[id];

                        //Clean queued defines too. Go backwards
                        //in array so that the splices do not
                        //mess up the iteration.
                        eachReverse(defQueue, function(args, i) {
                            if (args[0] === id) {
                                defQueue.splice(i, 1);
                            }
                        });
                        delete context.defQueueMap[id];

                        if (mod) {
                            //Hold on to listeners in case the
                            //module will be attempted to be reloaded
                            //using a different config.
                            if (mod.events.defined) {
                                undefEvents[id] = mod.events;
                            }

                            cleanRegistry(id);
                        }
                    };
                }

                return localRequire;
            }

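makeRequire defines localRequire in its closure, adds some new properties to localRequire through mixin, and finally returns localRequire.

localRequire is the entry function for module loading. Internally it uses intakeDefines to pull the pending define calls from globalDefQueue into the context, and then creates a module loading task through context.nextTick.

This module loading task has three parts: call intakeDefines; create a module object and use it to load the requested modules; finally call checkLoaded.

6. The checkLoaded function

This function mainly checks whether all modules have finished loading; if they have not, it checks whether loading has timed out, and reports an error if it has.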
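
The timeout part of that check can be sketched as below. `hasExpired` is a hypothetical helper; the waitSeconds option, its default of 7 seconds and the meaning of 0 as "no timeout" come from the requirejs configuration documentation, while the exact comparison is my assumption.

```javascript
// Hypothetical helper: has loading run longer than the allowed wait?
function hasExpired(startTime, waitSeconds, now) {
    var waitInterval = waitSeconds * 1000;
    // A wait of 0 disables the timeout entirely.
    return Boolean(waitInterval) && (startTime + waitInterval) < now;
}

var startedAt = 1000;                                // made-up timestamp in ms
var lateCheck = hasExpired(startedAt, 7, 8001);      // just past 7 seconds
var earlyCheck = hasExpired(startedAt, 7, 8000);     // exactly 7 seconds
var neverExpires = hasExpired(startedAt, 0, 999999); // timeout disabled
```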

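The function's code is not shown here; knowing what it does is enough.

7. The module object and its properties

The module object is extremely important. It is requirejs's abstraction of a module and the basic unit requirejs operates on, the essence of the library. It is not analyzed here; we will discuss it later when digging into the details.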