Docker build and test Tensorflow

CI_DOCKER_EXTRA_PARAMS="-e CI_BUILD_PYTHON=python3 -e TF_BUILD_ENABLE_XLA=1" tensorflow/tools/ci_build/ci_build.sh CPU bazel test //tensorflow/...

Building OPs in Tensorflow

To accelerate machine learning operations, implementing them in C++ is preferred over Python. In TensorFlow, we can compile C++ code and call it from the Python runtime, as shown in [1]. First, we need the TensorFlow header directory in order to compile our code.

Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.sysconfig.get_include()

The compilation is done like this:

TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')

g++ -std=c++11 -shared zero_out.cc -o zero_out.so -fPIC -I $TF_INC -O2

We use word2vec [2] as another example to show the flow:

namespace {
const int kPrecalc = 3000;       // Precalculate this many (input, label) pairs.
const int kSentenceSize = 1000;  // Process sentences in chunks of this size.
}  // namespace

class SkipgramWord2vecOp : public OpKernel {
 public:
  explicit SkipgramWord2vecOp(OpKernelConstruction* ctx)
      : OpKernel(ctx), rng_(&philox_) {
    string filename;
    OP_REQUIRES_OK(ctx, ctx->GetAttr("filename", &filename));
    OP_REQUIRES_OK(ctx, ctx->GetAttr("batch_size", &batch_size_));
    OP_REQUIRES_OK(ctx, ctx->GetAttr("window_size", &window_size_));
    OP_REQUIRES_OK(ctx, ctx->GetAttr("min_count", &min_count_));
    OP_REQUIRES_OK(ctx, ctx->GetAttr("subsample", &subsample_));
    OP_REQUIRES_OK(ctx, Init(ctx->env(), filename));

    mutex_lock l(mu_);
    example_pos_ = corpus_size_;
    label_pos_ = corpus_size_;
    label_limit_ = corpus_size_;
    sentence_index_ = kSentenceSize;
    for (int i = 0; i < kPrecalc; ++i) {
      NextExample(&precalc_examples_[i].input, &precalc_examples_[i].label);
    }
  }

  void Compute(OpKernelContext* ctx) override {
    Tensor words_per_epoch(DT_INT64, TensorShape({}));
    Tensor current_epoch(DT_INT32, TensorShape({}));
    Tensor total_words_processed(DT_INT64, TensorShape({}));
    Tensor examples(DT_INT32, TensorShape({batch_size_}));
    auto Texamples = examples.flat<int32>();
    Tensor labels(DT_INT32, TensorShape({batch_size_}));
    auto Tlabels = labels.flat<int32>();
    {
      mutex_lock l(mu_);
      for (int i = 0; i < batch_size_; ++i) {
        Texamples(i) = precalc_examples_[precalc_index_].input;
        Tlabels(i) = precalc_examples_[precalc_index_].label;
        precalc_index_++;
        if (precalc_index_ >= kPrecalc) {
          precalc_index_ = 0;
          for (int j = 0; j < kPrecalc; ++j) {
            NextExample(&precalc_examples_[j].input,
                        &precalc_examples_[j].label);
          }
        }
      }
      words_per_epoch.scalar<int64>()() = corpus_size_;
      current_epoch.scalar<int32>()() = current_epoch_;
      total_words_processed.scalar<int64>()() = total_words_processed_;
    }
    ctx->set_output(0, word_);
    ctx->set_output(1, freq_);
    ctx->set_output(2, words_per_epoch);
    ctx->set_output(3, current_epoch);
    ctx->set_output(4, total_words_processed);
    ctx->set_output(5, examples);
    ctx->set_output(6, labels);
  }

 private:
  struct Example {
    int32 input;
    int32 label;
  };

  int32 batch_size_ = 0;
  int32 window_size_ = 5;
  float subsample_ = 1e-3;
  int min_count_ = 5;
  int32 vocab_size_ = 0;
  Tensor word_;
  Tensor freq_;
  int64 corpus_size_ = 0;
  std::vector<int32> corpus_;
  std::vector<Example> precalc_examples_;
  int precalc_index_ = 0;
  std::vector<int32> sentence_;
  int sentence_index_ = 0;

  mutex mu_;
  random::PhiloxRandom philox_ GUARDED_BY(mu_);
  random::SimplePhilox rng_ GUARDED_BY(mu_);
  int32 current_epoch_ GUARDED_BY(mu_) = -1;
  int64 total_words_processed_ GUARDED_BY(mu_) = 0;
  int32 example_pos_ GUARDED_BY(mu_);
  int32 label_pos_ GUARDED_BY(mu_);
  int32 label_limit_ GUARDED_BY(mu_);

  void NextExample(int32* example, int32* label) EXCLUSIVE_LOCKS_REQUIRED(mu_) {
    // ... (body elided: walks the corpus and emits (center, context) pairs)
  }

  Status Init(Env* env, const string& filename) {
    // ... (body elided: reads the corpus file, builds the vocabulary, and
    // resizes precalc_examples_ and sentence_)
    return Status::OK();
  }
};

REGISTER_KERNEL_BUILDER(Name("SkipgramWord2vec").Device(DEVICE_CPU), SkipgramWord2vecOp);

namespace tensorflow {

REGISTER_OP("SkipgramWord2vec")
    .Output("vocab_word: string")
    .Output("vocab_freq: int32")
    .Output("words_per_epoch: int64")
    .Output("current_epoch: int32")
    .Output("total_words_processed: int64")
    .Output("examples: int32")
    .Output("labels: int32")
    .Attr("filename: string")
    .Attr("batch_size: int")
    .Attr("window_size: int = 5")
    .Attr("min_count: int = 5")
    .Attr("subsample: float = 1e-3");

}  // end namespace tensorflow

To compile:

TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
g++ -std=c++11 -shared word2vec_ops.cc word2vec_kernels.cc -o word2vec_ops.so -fPIC -I $TF_INC -O2 -D_GLIBCXX_USE_CXX11_ABI=0

Then the word2vec C++ op can be used as follows:

import os
import tensorflow as tf

word2vec = tf.load_op_library(os.path.join(
    os.path.dirname(os.path.realpath(__file__)), 'word2vec_ops.so'))

(words, counts, words_per_epoch, current_epoch, total_words_processed,
 examples, labels) = word2vec.skipgram_word2vec(filename=opts.train_data,
                                                batch_size=opts.batch_size,
                                                window_size=opts.window_size,
                                                min_count=opts.min_count,
                                                subsample=opts.subsample)
(opts.vocab_words, opts.vocab_counts,
 opts.words_per_epoch) = session.run([words, counts, words_per_epoch])

The question is why we can call the op as “skipgram_word2vec”.
The answer is on page [1], in the section about naming in the TensorFlow Python wrappers.

For example,

REGISTER_OP("StringToNumber")
    .Input("string_tensor: string")
    .Output("output: out_type")
    .Attr("out_type: {float, int32} = DT_FLOAT")
    .Doc(R"doc(
Converts each string in the input Tensor to the specified numeric type.
)doc");

“A note on naming: Inputs, outputs, and attrs generally should be given snake_case names. The one exception is attrs that are used as the type of an input or in the type of an output. Those attrs can be inferred when the op is added to the graph and so don’t appear in the op’s function. For example, this definition of StringToNumber will generate a Python function that looks like:”

def string_to_number(string_tensor, out_type=None, name=None):
  """Converts each string in the input Tensor to the specified numeric type.

  Args:
    string_tensor: A `Tensor` of type `string`.
    out_type: An optional `tf.DType` from: `tf.float32, tf.int32`.
      Defaults to `tf.float32`.
    name: A name for the operation (optional).

  Returns:
    A `Tensor` of type `out_type`.
  """

Therefore, the following registration line for SkipgramWord2vec,

    .Output("vocab_word: string")

will generate
def skipgram_word2vec(...):
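The CamelCase-to-snake_case mapping can be sketched in Python (my own re-implementation for illustration only; TensorFlow's actual converter lives in its wrapper generator):

```python
import re

def op_name_to_python(op_name):
    # Insert an underscore before capital letters, then lowercase,
    # e.g. "SkipgramWord2vec" -> "skipgram_word2vec".
    s = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", op_name)
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s).lower()
```

So a registered op named "SkipgramWord2vec" surfaces in Python as skipgram_word2vec, and "StringToNumber" as string_to_number.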


The code is from the upstream tensorflow/models repository. Thanks to nealwu’s help on GitHub.

There is also a Chinese blog discussing TensorFlow’s ops.

[1] TensorFlow documentation

How does dynamic program generation work?

How does dynamic program generation work? by @BrianRoemmele

Answer by Brian Roemmele:

The secret to Viv is that the system actually writes its own code. In contrast to any other similar system, it is a profound and monumental leap forward.

Dynamically Evolving Cognitive Architecture

The structure of the Voice First world is held together by Intelligent Agents.  Intelligent Agents use AI (Artificial Intelligence) and ML (Machine Learning) to decode volition and intent from an analyzed phrase or sentence.  The AI in most current generation systems like Siri, Echo and Cortana focuses on speaker independent word recognition and to some extent the intent of predefined words or phrases that have a hard coded connection to a domain expertise. 

Viv uses a patented [1] exponential self-learning system, as opposed to the linearly programmed systems currently used by Siri, Echo and Cortana.  What this means is that the technology in use by Viv is orders of magnitude more powerful, because Viv's operational software requires just a few lines of seed code to establish the domain [2], ontology [3] and taxonomy [4] to operate on a word or phrase.

In the old paradigm, each task or skill in Siri, Echo and Cortana needed to be hard coded by the developer and siloed into itself, with little connection to the rest of the custom-programmed lexicon of domains.  This means that these systems are limited in how fast and how large they can scale. Ultimately each silo can connect through related ontologies and taxonomies, but it is highly inefficient. At some point the lexicon of words and phrases will become a very large task to maintain and update. Viv solves this rather large problem with simplicity for both the system and the developer.

Specimen of the developer console identifying domain intent programming.

Viv's team calls this new paradigm the "Dynamically evolving cognitive architecture system".  There is limited public information on the system, and I cannot address any private information I may have access to.  However, the patent, "Dynamically evolving cognitive architecture system based on third-party developers" [5], published on December 24th, 2014, offers an incredible insight into the future.

Dynamically evolving cognitive architecture system based on third-party developers

US 20140380263 A1


A dynamically evolving cognitive architecture system based on third-party developers is described. A system forms an intent based on a user input, and creates a plan based on the intent. The plan includes a first action object that transforms a first concept object associated with the intent into a second concept object and also includes a second action object that transforms the second concept object into a third concept object associated with a goal of the intent. The first action object and the second action object are selected from multiple action objects. The system executes the plan, and outputs a value associated with the third concept object.

Some consumers and enterprises may desire functionality that is the result of combinations of services available on the World Wide Web or “in the cloud.” Some applications on mobile devices and/or web sites offer combinations of third-party services to end users so that an end user's needs may be met by a combination of many services, thereby providing a unified experience that offers ease of use and highly variable functionality. Most of these software services are built with a specific purpose in mind. For example, an enterprise's product manager studies a target audience, formulates a set of use cases, and then works with a software engineering group to code logic and implement a service for the specified use cases. The enterprise pushes the resulting code package to a server where it remains unchanged until the next software release, serving up the designed functionality to its end user population.

Viv Has Built An Easy Way For Developers To Build

This Viv patent is a landmark advance for Intelligent Agents and the resulting Voice First devices and use cases that will be developed on the platform.  Adding a new domain of experience is a simple process in the developer app.

To define a new intent, the domain is established by programming a horizontal flow chart that helps to define ontology and taxonomy within the entire system.  The results are lines of code that will forever be dynamically changing and connecting as more domains of intent are established.  Viv literally programs itself.  This process is related to self-modifying code, which has been around since the 1960s, from assembly language to COBOL.  However, the process that Viv uses is radically more advanced.

Specimen of the developer console identifying domain intent programming.

The limitations we have all come to know with Siri, Echo and Cortana, and the Chat Bots released with Facebook M, are tied to the limitations of extending new intent domains and connecting new ontologies and taxonomies.  Not only does each intent domain need to be programmed, starting from decoding a word or phrase, but these silos of intents need to somehow connect when more complex sentences are created.  For example:

"(Siri-Alexa) I want to pick up a Pizza on the way to my girl friend's house and I would like to find a perfect wine to pick up along the way. Also would like to bring her flowers."

Currently, Siri and Alexa could not understand the intent of this paragraph, nor could they easily connect the six domains and many ontologies needed to produce a useful result.  Viv could learn this in a few minutes and constantly connect with new intent domains by expanding the ontological references each domain represents.

Another feature of Viv will be the user profile that defines:

Conversational intent – Understands What You Say:

– Location context

– Time context

– Task context

– Dialog context

Understands You — Learns and acts on personal information:

– Who are your friends

– Where do you live

– What is your age

– What do you like

You will set privacy fences around any information that Viv learns and you will be able to choose to allow the system to share this data with any intent domain.  Of course security and privacy will always be an issue with Intelligent Agents and Viv is working on a new model that will quickly define what is logically private and potentially shareable with permissions. 

Specimen of the current domain cloud.  Note the appearance of Payments and Money.

The  "Dynamically evolving cognitive architecture system based on third-party developers" explains the complexity involved this way:

Specimen flow chart from "Dynamically evolving cognitive architecture system based on third-party developers" patent.

FIG. 1 illustrates a block diagram of an example plan 100 created by a dynamically evolving cognitive architecture system based on third-party developers, in which action objects are represented by rectangles and concept objects are represented by ovals, under an embodiment. User input 102 indicates that a user inputs “I want to buy a good bottle wine that goes well with chicken parmesan” to the system. The system forms the intent of the user as seeking a wine recommendation based on a concept object 104 for a menu item, chicken parmesan. Since no single service provider offers such a use case, the system creates a plan based on the user's intent by selecting multiple action objects that may be executed sequentially to provide such a specific recommendation service. Action object 106 transforms the concept object 104 for a specific menu item, such as chicken parmesan, into a concept object 108 list of ingredients, such as chicken, cheese, and tomato sauce. Action object 110 transforms the list of ingredients concept object 108 into a concept object 112 for a food category, such as chicken-based pasta dishes. Action object 114 transforms the food category concept object 112 into a concept object 116 for a wine recommendation, such as a specific red wine, which the system outputs as a recommendation for pairing with chicken parmesan. Even though the system has not been intentionally designed to create wine recommendations based on the name of a menu item, the system is able to intelligently synthesize a way of creating such a recommendation based on the system's concept objects and action objects. Although FIG. 1 illustrates an example of a system creating a single plan with a linear sequence that includes three action objects and four concept objects, the system creates multiple plans each of which may include any combination of linear sequences, splits, joins, and iterative sorting loops, and any number of action objects and concept objects. 
Descriptions below of FIGS. 4, 5, and 6 offer examples of multiple non-linear plans with splits, joins, and other numbers of action objects and concept objects.

Just in this simple sentence, "I want to buy a good bottle of wine that goes well with chicken parmesan", there are dozens of intent domains that would be connected.  Viv can build a result dynamically even if this question has never been asked before.  Viv operates on the intent domains from the extracted words in the sentence and constructs an answer in real time.
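The plan in FIG. 1 can be caricatured in a few lines of Python (entirely my own toy model; the lookup data and function names are hypothetical, not Viv's code): action objects are functions, concept objects are their inputs and outputs, and executing the plan is just function composition:

```python
# Toy model of the patent's plan: three "action objects" chained so that each
# transforms one "concept object" into the next.

def menu_item_to_ingredients(item):
    # hypothetical lookup standing in for a real data provider
    return {"chicken parmesan": ["chicken", "cheese", "tomato sauce"]}[item]

def ingredients_to_category(ingredients):
    return "chicken-based pasta dishes" if "chicken" in ingredients else "other"

def category_to_wine(category):
    return {"chicken-based pasta dishes": "a specific red wine"}.get(category)

def execute_plan(menu_item):
    ingredients = menu_item_to_ingredients(menu_item)  # concept 104 -> 108
    category = ingredients_to_category(ingredients)    # concept 108 -> 112
    return category_to_wine(category)                  # concept 112 -> 116

wine = execute_plan("chicken parmesan")
```

The interesting part of the patented system is that this chain is discovered at runtime by searching the registered action objects, rather than being hand-coded as above.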

Viv Labs will be opening the system up to developers and I am predicting a land rush similar to when Apple opened up the App store for the iPhone.

Payments Are The Foundation And Replace Advertising

Central to every Voice First system is Voice Commerce and Voice Payments. The patent speaks to this in a unique way:

Content application program interface providers desire branding, to sell advertising, and/or to sell access to restricted content. Data providers and data curators want recognition, payment for all content, and/or payment for enhanced or premium content. Transaction providers desire branding and transactions via selling of some good or service. Advertisers desire traffic from qualified end users. A single person or organization may play more than one of these roles.

The shift from the push mechanisms of the current advertising paradigm to the pull mechanisms of Voice Commerce will define the rise of Voice Payments. Currently, even the most technically advanced payments companies are not in a position to adapt to the new paradigms.

Viv Learns The Way Humans Learn

This type of system may sound very familiar to most of us: it is very close to the manner in which humans learn. We assemble domains and form ontologies that connect intent.

On December 24th, 2014, when I first viewed the "Dynamically evolving cognitive architecture system based on third-party developers" patent from Six Five Labs, "goose bumps" ran through my entire body, for in that moment I saw what I had been studying since 1989 in my Voice manifesto. I spoke to this in some detail recently here on Quora [6].  In this Quora Knowledge Prize posting I detail how Voice First systems will completely change advertising, commerce and payments. I went into more detail in the industry publication Tech.pinions [7].

Viv is the first system to pull together the right elements of speech recognition, speech synthesis, AI, ML, self modifying programs, commerce and payments in such a way that I assert in 10 years 50% of computer interactions will be via Voice primarily on Voice First devices.  The Viv we see today (May 9th, 2016) is one small step in this direction, but a giant leap for the future of computers.

[1] Patent US20140380263 – Dynamically evolving cognitive architecture system based on third-party developers

[2] Domain Ontologies in OSF

[3]  Ontology (information science)

[4] Taxonomy

[5] Patent US20140380263 – Dynamically evolving cognitive architecture system based on third-party developers

[6] Brian Roemmele's answer to Is Amazon Echo (and/or Siri and other voice assistants) actually useful, or is it just a novelty? Are usage and retention of these products growing?

[7] There is A Revolution Ahead and It Has A Voice


What machine learning methods make use of differential equations?

What machine learning methods make use of differential equations? by Lei Zhang

Answer by Lei Zhang:

For a specific example, to back-propagate errors in a feed-forward perceptron, you would generally differentiate one of the three activation functions: step, tanh or sigmoid.  (To clarify for those who don't know, a perceptron is a neural network, generally with a feed-forward, back-propagating iteration, which means that the input and the information used to derive the final results only go forward, while the error is figured out after you arrive at your result and then goes backwards to each weight to determine which should change, and by how much, to reduce future errors.)

The sigmoid function being

s(x) = 1 / (1 + e^(-βx))

and its derivative being

s'(x) = β · s(x) · (1 − s(x))

keeping β explicit, assuming beta is not one.  You can find the functions and their derivatives on Wikipedia.

Artificial Neural Networks/Activation Functions
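As a quick numerical sanity check (a Python sketch of my own; `beta` is the steepness parameter mentioned above), the analytic sigmoid derivative beta·s·(1−s) can be compared against a finite difference:

```python
import math

def sigmoid(x, beta=1.0):
    # s(x) = 1 / (1 + exp(-beta * x)); beta sets the steepness
    return 1.0 / (1.0 + math.exp(-beta * x))

def sigmoid_prime(x, beta=1.0):
    # analytic derivative: beta * s(x) * (1 - s(x))
    s = sigmoid(x, beta)
    return beta * s * (1.0 - s)

# central finite difference at an arbitrary point, for comparison
fd = (sigmoid(0.3 + 1e-6, beta=2.0) - sigmoid(0.3 - 1e-6, beta=2.0)) / 2e-6
```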

So basically, if you want to use your own activation functions, you will need to work out your own derivatives.  These three work well for single-hidden-layer (or no-hidden-layer) perceptrons, but for what people are calling "deep learning", i.e. more than one hidden layer, the error propagation can get complex.

You need to know why you are using these activation functions, and sometimes you would choose one function over another purely for its derivative properties.  Error correction is half of the AI (some people will argue it's all of the AI): if it does not correctly find which nodes/weights are responsible for the error, your AI will never learn (which is really not AI).  So choosing a good function and its derivative is very important, and this is where the math matters more than the programming.  Seriously, you can build an AI in less than 15,000 lines of code (C++).
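To make the back-propagation recipe concrete, here is a minimal sketch (my own illustration, not from the answer): a single-hidden-layer sigmoid network trained on XOR with hand-written gradients. The layer sizes, learning rate and step count are arbitrary choices:

```python
import numpy as np

def train_xor(steps=5000, lr=1.0, seed=0):
    """Train a 2-8-1 sigmoid network on XOR with hand-written backprop."""
    rng = np.random.default_rng(seed)
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = np.array([[0.], [1.], [1.], [0.]])
    W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
    W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    losses = []
    for _ in range(steps):
        # forward pass
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        losses.append(float(((out - y) ** 2).mean()))
        # backward pass: each delta uses the sigmoid derivative s * (1 - s)
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)
    return losses

losses = train_xor()
```

The backward pass is exactly the chain rule applied through the two sigmoid layers; swapping in a different activation means swapping in its derivative in the two delta lines.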

Have fun


Not Attend DAC 2016

After presenting research papers at DAC twice (DAC 2014 and DAC 2015), I will not attend DAC 2016 this year. The main reason is that I did not submit any paper there. I did not submit to ICCAD 2016 either, but I do volunteer to review papers for those two conferences.

The reason I have stopped submitting papers to conferences for a while is that I have two journal papers (the journal versions of MATEX and SPICE_Diego for Prof. Ernest Kuh) published in this half year, with another one and my thesis in preparation. Moreover, I have a software engineer job at ANSYS Apache. The technologies and real impact attract me more recently. (Maybe I will focus on academic research sometime later.) Besides, I feel privileged to work with teams consisting of legends in the area of design automation algorithms, e.g., the forerunners and researchers of AWE, and the creators of MIT FastCap, CMU PRIMA, UT RICE and Synopsys PrimeTime.

I am kind of laid-back compared to previous years in the VLSI CAD and EDA area, because I have been going down another path of adventure, which excites me quite a bit more than traditional EDA. You might find some clues between the lines. I will keep low-key as usual (am I? 🙂). I wish the results turn out well in the future, so that I can feel proud of the updates, just like the products I help to build and follow, such as the industrial golden chip sign-off software Redhawk, the first machine learning and big data platform in EDA, Seascape, and the in-design power simulation tool Seahawk.

My friends, have a good time at DAC 2016 and Austin.




My manager at ANSYS Apache, Dr. Steven P. McCormick, was an MIT grad working with Professor Jonathan Allen. How awesome is it that their lab built a speech machine that was used by physicist Stephen Hawking! They also developed the techniques that inspired the well-known AWE in the EDA area. Check the references and acknowledgements in the papers. I do not want to overemphasize this; they were quite laid-back. So he moved smoothly from computational linguistics and speech processing to VLSI and EDA.