GitHub social conditioning

Microsoft's GitHub isn't about social coding anymore: it is all about social conditioning now.

Yesterday Microsoft took another clear-cut posturing and virtue-signaling action (much like their decision not to sell certain technology to police departments). This time around they declared war on the English language and made it clear to everybody that they are in a position to decide the meaning of words for the rest of us. In particular, for the software development communities. Yes, I am talking about the controversial and illogical step to remove the common term "master" from their popular code-hosting service GitHub.

Today, after almost ten years of being a customer, I have deleted my GitHub account and moved completely elsewhere. If you're thinking about doing the same, you can find ample alternatives, from bitbucket.org to jetbrains.space, with many options in between.

Replacing words in programming languages or changing their meaning has nothing to do with social justice or a better world. It is how big tech companies flex their muscles and exercise control over the software development crowd.

Submission or outright genocide through commercial means is what Microsoft did to Free Software, and later to Open-Source Software, for years. This is what Microsoft keeps doing no matter how many times their management says "we embraced open-source" or "we admit our mistakes in the past": they still have the same agenda, and they still do everything they can to subordinate open-source development to their command. Massive contributions to the Linux Foundation and the Apache Software Foundation are just that: tactical moves to set up people who will do their bidding for them.

Perhaps open-source developers and other software professionals will hear this and hit Microsoft back exactly where it hurts: their P&L, their user base, and the influence they should no longer have.

Hey LinkedIn – you’re next!

Bad code can ruin your life

How ‘no skin in the game’ can lock down the whole world

Update May 22, 2020 1600 GMT
And I see there's a request under the Freedom of Information Act, asking ICL to release the original version of the code that was used to produce Report 9's numbers. The request has been acknowledged by ICL, and now we are all sitting at the edge of our seats waiting for their answer by… wait for it… June 22, 2020. Surely, they are taking their sweet time, and somehow I don't hold my breath for any positive outcome here.

Update May 20, 2020 1400 GMT
The defenders of the ICL code are trying at all costs to avoid the direct requests to release the original code that went into the infamous Report 9. Why? Is there _anything_ to hide? How heavily has the groomed version on GitHub indeed been groomed?

The latest turn of the escalation shows the real colors and true demagoguery of these fellas: the whole spectrum from calling their opponents "lockdown skeptics" to accusing them of being "ideologically motivated".

Whatever they are trying to achieve with these words (to diminish constructive criticism, perhaps?), they are using the oldest trick of boulevard presstitutes in the book: if you can't attack the argument, attack the men behind it!

Update May 18, 2020 1400 GMT
It is real fun: my favorite financial blog Zerohedge has picked up the story. I have been reading these guys for 12 years, ever since the financial crisis of 2008 hit the world. And now they are writing about me. Small world 😉

Update May 18, 2020 0900 GMT
A few commentators tried to make an issue of me allegedly bashing FORTRAN for its inaccuracies and age. Nothing could be farther from the truth. The point was (and it is impossible to explain in a non-technical editorial) that running old FORTRAN code through a modern C or C++ compiler (and that is what they were actually doing) is likely to produce a lot of rounding effects unless you choose very carefully the level of optimization the compiler is allowed to perform.
Clearly, the fact that Imperial's code wasn't consistent even in the case of single-threaded runs shows that something was wrong with the way their numerical manipulations were done. Sapienti sat.
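
Here is a minimal sketch of the effect in C (my own toy example, not the ICL code): floating-point addition is not associative, so the order of operations changes the result. An optimizer that is allowed to reassociate (e.g. gcc with -O3 -ffast-math) may turn a plain accumulation loop into several partial sums, the way a SIMD unit works, silently changing the output in the last digits; over millions of model iterations that drift compounds.

/* Sums the same series twice: once in strict source order, once in the
   four-lane interleaved order a vectorizer would use. The results differ. */
#include <stdio.h>

int main(void) {
    float serial = 0.0f, lanes[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (int i = 1; i <= 10000000; i++)
        serial += 1.0f / (float)i;             /* strict source order */
    for (int i = 1; i <= 10000000; i += 4)     /* reassociated order: */
        for (int k = 0; k < 4; k++)            /* four partial sums   */
            lanes[k] += 1.0f / (float)(i + k);
    float vectorized = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
    printf("serial=%.7f vectorized=%.7f diff=%g\n",
           serial, vectorized, (double)(serial - vectorized));
    return 0;
}

Both runs add up exactly the same numbers; they merely round differently. That is why the optimization level has to be chosen very deliberately when numerical reproducibility matters.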

Just this morning, the Sunday Telegraph posted the article "Imperial model could be the most devastating software mistake of all time" [1], which my colleague and old friend David and I penned together.

Unless you have been living on the Moon or in Sweden, chances are you have been put under strict quarantine, California-style "stay at home" isolation, or Thailand's senseless beach closures and "tambon lockdown". At any rate, your life has turned into something completely different: your customary daily routines are thrown out of whack, and you have to rearrange your life to fit the work-from-home paradigm, if you aren't one of the 37 million people in the US who have already lost their jobs.

Predictably, some of the comments were quick to point out that we are neither epidemiologists nor virologists and don't understand that the world is stochastic, so a model's non-determinism is completely OK: you can average out the dispersed results and still get somewhat correct estimates in the end. Indeed, we aren't. But other epidemiologists argue that the model is flawed. BTW, you can read the whole Sunday double-page spread in this PDF file.

Clearly, there is a gap in understanding here: the world's non-determinism has a very different nature than the pseudo-random behavior of a computer model with a fixed set of input parameters. I won't repeat the analysis and great insights put together in [2], [3], [4]; you can read them on your own time.
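
To make the distinction concrete, here is a toy sketch in C (mine, not taken from the reviews): the "randomness" of a simulation comes from a pseudo-random number generator, and a PRNG is a deterministic function of its seed. With the seed and the inputs fixed, two single-threaded runs must produce bit-identical outputs; if they don't, that is a defect, not "stochasticity".

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    srand(42);                    /* the seed is just another fixed input */
    for (int i = 0; i < 5; i++)
        printf("%d\n", rand());   /* the same five numbers on every run,
                                     given the same platform and C library */
    return 0;
}

Averaging across runs is legitimate only when each individual run is reproducible; averaging over runs that differ for unexplained reasons means averaging over bugs.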

I'd rather point out one thing: the lack of accountability, or what is commonly known as "no skin in the game", allows people to make all sorts of outright lousy predictions and policy decisions (yes, "Dr." Fauci, I am talking about you: have you seen a real patient in the last 30 years, "doctor"?) with disastrous effects on millions of people, shutting down multi-trillion-dollar economies without any understanding of how to get out of the mess or why they did it in the first place [5].

David and I had a lot of fun talking about this and some other things in the podcast episode below, which you might enjoy.

Dr Yeturu Aahlad: from Chennai, India to Silicon Valley and groundbreaking research in distributed computing (The Rebelutionaries)

Dr Yeturu Aahlad, or simply Aahlad as he is better known, was born in Chennai (formerly Madras), India. He did his undergraduate studies in Electrical Engineering at IIT Madras, and went on to earn a PhD in Computer Science from The University of Texas at Austin, where he studied with some of the greats of computer science, including one Edsger W. Dijkstra. Dijkstra is in fact credited as one of the major driving forces behind the acceptance of computer programming as a scientific discipline. Hired by IBM as an intern after a noisy critique at an academic conference, Aahlad filed 3 patents over a few months in the summer. He went on to become a distributed systems architect at Sun Microsystems, where he authored parts of CORBA (I remember someone once asking Aahlad how he knew CORBA couldn't do what we do, and his answer was the best ever: "because I wrote both"). In 2005 we founded WANdisco together on the back of 5 years of research Aahlad had completed on reliable and available distributed systems, which led him to the Paxos algorithm, still not widely known at that time. He has directly authored over 30 patents in distributed computing (among other areas) and, in my opinion, is one of the world's greatest computer scientists.
  1. If you want to read the complete article, you can use a Linux terminal browser like links, or simply disable scripts for that page (that is, if you use an advanced browser like Brave).
  2. Imperial’s Model code review
  3. Second code review of Imperial’s Model (walk of shame continues)
  4. 15,000 lines of C-code in a single file
  5. While at it, think how much of this disaster could have been avoided if the model and the implementing code had been peer-reviewed in a truly open-source fashion.

Freshening up my business website

As some of you might know, I have been running this consulting gig for a while as a side business. Recently, I decided to invest more time and attention in it, perhaps even making it my full-time vocation.

Going forward, I am going to post more content on the new site, but this one will remain active for archival/historical purposes and for things that aren't suited for the business website.

Please come and visit me at c-systems.me

Cool FOSS heads prevail once again

As you have seen in my last post or elsewhere, Facebook has recently added a dubious patent clause to the license of multiple projects, including ReactJS. And predictably, a number of organizations, companies, and open-source advocates made it clear that it's way too dangerous to keep using code with such restrictions because of possible legal repercussions.

Well, I am pleased to tell all my readers that Facebook has backtracked on this after the Apache Software Foundation, WordPress, and many others expressed their clear intention of switching to safe alternatives to React.js and other FB frameworks, or banning their use outright. As you all know, FOSS is a free-market ecosystem; it thrives on the forces of intellectual competition, always offering multiple choices to its users. And this approach won again: facing the danger of losing their user base and, effectively, rendering themselves irrelevant, Facebook decided to, once again, re-license some of their projects under MIT.

Namely, ReactJS will be released under the new license. So if you are using it, make sure to update your dependencies to v16 once it is out next week. Remember, re-licensing usually isn't retroactive, so don't fall into that trap.

Disclaimer: I am not using, planning to use, nor recommending any Facebook-sponsored projects.

And let the Dao be with you, as usual 😉

Facebook-licensed code is kicked out

In a somewhat recent revelation about the pitfalls of the infamous Facebook "BSD + Patents" license, FOSS developers are becoming more acutely aware of, and alarmed by, the consequences.

I won't bother you with the details, as they are readily available elsewhere. I just want to point out that Facebook is hedging their open-source "exposure". What they are effectively saying is: "Go ahead and use our awesome stuff. But if we ever decide that you're competing with us, we'll yank your license to use our frameworks so fast your shoes will fall off." It doesn't matter if someone else developed this code for you: you won't be able to use it anyway.

That's the essence. It is the original intention of the license behind ReactJS and a few other frameworks. And that's why the Apache Software Foundation has moved the license to Cat-X, prohibiting any of its projects from touching things like ReactJS. Facebook software is NOT compatible with projects developed under the widely accepted and respected ALv2.

Here’s the excerpt:

Facebook BSD+Patents license

The Facebook BSD+Patents license includes a specification of a PATENTS file that passes along risk to downstream consumers of our software imbalanced in favor of the licensor, not the licensee, thereby violating our Apache legal policy of being a universal donor. The terms of Facebook BSD+Patents license are not a subset of those found in the ALv2, and they cannot be sublicensed as ALv2.

These are the unintended consequences of meddling with well-thought-out open-source software licenses. That is the beauty of open source: if you try to lock people in or out, they will move. It doesn't matter how much money you have, how big you are, nor what your SJW position is. Developers will go, and the users will as well.

I'm sure we haven't heard the last of it yet. And that was a damning and loud application of the golden rule!

Gab.ai was kicked off Google Play…

But who cares… All you need to do is go to Gab's website; right there on the left side is a link where you can grab the .apk package for the mobile app and install it.

Make sure your phone's settings allow installing applications from "Unknown sources" (I will let you figure out how to do that ;), and voilà. Enjoy!

Finally, I have moved away from Google!

As you might have noticed, my blog is no longer hosted on Blogger.com (actually Google).

I did it for two reasons:

  1. I had been planning it for a long time because of the somewhat mediocre functionality of Blogger.
  2. The last straw was Google's reaction to the intellectual argument of James Damore (if you aren't familiar with the story, you were probably part of the first Mars expedition). I cannot trust my content to a company that suppresses free speech in full disregard of individual rights protected by the law of the land.

I made an effort to make sure that the old URLs work properly and redirect you to the new location. That should take care of cached searches and bookmarks. If you notice that something is missing, please let me know so I can fix it ASAP.

Anyway, now I am here. Check back soon for new articles!

Apache Ignite vs Alluxio (formerly Tachyon)

Well, ever since the company behind the read-only open-source project called Tachyon decided to change the name of the project, I was puzzled. If you build something successful, you want its name to be recognized, right? In marketing, it is called "brand recognition".

Why would Coca-Cola rename their product to SludgeWaters? Indeed, it doesn't make much sense! The most infamous brand-recognition screw-up was when SUNW (Sun Microsystems) got renamed to JAVA on NASDAQ. And _that_ ended well, for sure. The brilliant idea belonged to the Silicon Valley class clown with the pony-tail. I am sure you know whom I refer to.

At any rate, why would an allegedly successful software project change its name in the middle of its rise? I have a hypothesis: any time one searched for Tachyon on Google (or elsewhere), the first link popping up would be my blog post from last year, and the close second would point to the story of how the Tachyon BDFL decided to remove my benign answer from their public mailing list.

So, in the interest of history preservation, I am putting up a new post, correcting the name to reflect the new reality of the Alluxio project. The technical findings stand the same, so just go and read the year-old post to figure out where the old application with the new name falls short.

Last but not least, since the time of the original write-up Apache Ignite has graduated to an Apache top-level project, which is why the "(incubating)" suffix is dropped as well 😉

Let’s speed up Apache Hive with Apache Ignite & Apache Bigtop

Today we will be looking into how we can speed up Hive using Apache Ignite. For this particular exercise I will be using the Apache Bigtop stack v1.0, because I don't care to waste my time on manual cluster setup; nor do I want to use any of the overly complex stuff like Cloudera Manager or Ambari. I am a Unix command-line guy, and the CLI leaves all these fancy yet half-baked contraptions biting the dust. Let's start.

For simplicity, I'd suggest using Docker. If you don't know how to use Docker, you can do the same on your own system and clean up the mess later. Or better yet, learn how to use Docker (if you're on a Mac, you're on your own!). Despite all the hype around it, it is still a useful tool in some cases. I'll be using a container from the official Bigtop Ubuntu 14.04 image:

% sudo docker run -t -i -h 'bigtop1.docker' bigtop/slaves:ubuntu-14.04 /bin/bash
% git clone https://git-wip-us.apache.org/repos/asf/bigtop.git
% cd bigtop
Now you can follow bigtop-deploy/puppet/README.md on how to deploy your cluster. Make sure you have selected hadoop, yarn, ignite-hadoop, and hive while editing /etc/puppet/hieradata/site.yaml (as specified in the README.md). Once the puppet apply command has finished, you should have a nice single-node cluster running HDFS, YARN, and ignite-hadoop with IGFS. Hive should be configured and ready to run. Let's do a couple more steps to get the data in place and ready for the experiments:

;; http://hortonworks.com/hadoop-tutorial/how-to-process-data-with-apache-hive/
;;
;; wget http://seanlahman.com/files/database/lahman591-csv.zip
;; unzip lahman591-csv.zip
;; Now stage the data in HDFS:
;;   hadoop fs -copyFromLocal Batting.csv /user/hive
;;   hadoop fs -copyFromLocal Master.csv /user/hive
And now to Hive. Make sure it is executed with the proper configuration to take advantage of the in-memory data fabric provided by Apache Ignite. Let's start the Hive CLI to work with the Ignite cluster, set up the tables, and run some queries:

% HADOOP_CONF_DIR=/etc/hadoop/ignite.client.conf hive cli
create table temp_batting (col_value STRING);
create table batting (player_id STRING, year INT, runs INT);
LOAD DATA INPATH '/user/hive/Batting.csv' OVERWRITE INTO TABLE temp_batting;
insert overwrite table batting
SELECT
  regexp_extract(col_value, '^(?:([^,]*)\,?){1}', 1) player_id,
  regexp_extract(col_value, '^(?:([^,]*)\,?){2}', 1) year,
  regexp_extract(col_value, '^(?:([^,]*)\,?){9}', 1) run
from temp_batting;
SELECT COUNT(*) FROM batting WHERE year > 1909 AND year <= 1969;
;; let’s do something more real
SELECT a.year, a.player_id, a.runs from batting a
    JOIN (SELECT year, max(runs) runs FROM batting GROUP BY year ) b
    ON (a.year = b.year AND a.runs = b.runs) ;

Notice the execution times of both queries.
Quit the Hive session and restart it with the standard configuration to run on top of YARN:

% hive cli

;; All the tables are still in place, so let’s just repeat the queries:
SELECT COUNT(*) FROM batting WHERE year > 1909 AND year <= 1969;
SELECT a.year, a.player_id, a.runs from batting a
    JOIN (SELECT year, max(runs) runs FROM batting GROUP BY year ) b
    ON (a.year = b.year AND a.runs = b.runs) ;

Once again: notice the execution times and appreciate the difference! Enjoy!