Will they blend? Today: A Cross-Platform Ensemble Model: R meets Python and KNIME. Embrace Freedom in the Data Science Lab

In this blog series we’ll be experimenting with the most interesting blends of data and tools. Whether it’s mixing traditional sources with modern data lakes, open-source devops on the cloud with protected internal legacy tools, SQL with noSQL, web-wisdom-of-the-crowd with in-house handwritten notes, or IoT sensor data with idle chatting, we’re curious to find out: will they blend? Want to find out what happens when IBM Watson meets Google News, Hadoop Hive meets Excel, R meets Python, or MS Word meets MongoDB?

Today: A Cross-Platform Ensemble Model: R meets Python and KNIME. Embrace Freedom in the Data Science Lab

The Challenge

Today’s challenge is to build a cross-platform ensemble model. The ensemble must combine a Support Vector Machine (SVM), a logistic regression, and a decision tree. Let’s raise the bar even higher and train these models on different analytics platforms: R, Python, and of course KNIME. (Note that we could, of course, create all of these models in KNIME, but that would kill the rest of the story...)

A small group of three data scientists was given the task of predicting flight departure delays from Chicago O’Hare (ORD) airport, based on the airline data set. As soon as the data came in, each data scientist built a model in record time. That is, each of them built a different model on a different platform! We ended up with a Python script to build a logistic regression, an R script to build an SVM, and a KNIME workflow to train a decision tree. Which one should we choose?

We had two options here: select the best model and declare it the champion, or embrace diversity and build an ensemble model. Since more is usually better than less, we opted for the ensemble model. Thus, we just needed to convince two out of the three data scientists to switch analytics platforms.

Or maybe not.

Thanks to its open architecture, KNIME can easily integrate R and Python scripts. In this way, every data scientist can use their preferred analytics platform, while KNIME collects and fuses the results.

Today’s challenge has three main characters: a decision tree built on KNIME Analytics Platform, an SVM built in R, and a logistic regression built with Python. Will they blend?

Topic. Flight departure delays from Chicago O’Hare (ORD) Airport.

Challenge. Build a cross-platform ensemble model, by blending an R SVM, a Python logistic regression, and a KNIME decision tree.

KNIME Extensions. Python and R Integrations.
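
To make the idea of "fusing the results" concrete, here is a minimal Python sketch of two common ways to combine the outputs of three classifiers: averaging their predicted probabilities (soft voting) and taking a majority vote (hard voting). This is an illustration only, not necessarily how the KNIME workflow in this post fuses the models. The model names, the 0.5 threshold, and the probability values are made-up assumptions; in practice each list would hold the "delayed" probabilities produced by the KNIME decision tree, the R SVM, and the Python logistic regression for the same flights.

```python
from collections import Counter
from statistics import mean

# Hypothetical "delayed" probabilities for four flights, one list per model.
# The values are invented for illustration; they are not results from the post.
predictions = {
    "knime_decision_tree": [0.90, 0.20, 0.55, 0.40],
    "r_svm": [0.80, 0.35, 0.45, 0.30],
    "python_logistic_regression": [0.70, 0.10, 0.60, 0.65],
}


def average_probabilities(model_probs):
    """Soft voting: average the models' probabilities for each flight."""
    return [mean(row) for row in zip(*model_probs.values())]


def majority_vote(model_probs, threshold=0.5):
    """Hard voting: each model votes delayed (True) or on time (False)."""
    votes_per_model = [
        [p >= threshold for p in probs] for probs in model_probs.values()
    ]
    # Transpose so that each tuple holds the three votes for one flight.
    votes_per_flight = zip(*votes_per_model)
    # With three models and two classes there is always a strict majority.
    return [Counter(votes).most_common(1)[0][0] for votes in votes_per_flight]


soft = average_probabilities(predictions)
hard = majority_vote(predictions)

for i, (p, v) in enumerate(zip(soft, hard), start=1):
    print(f"flight {i}: mean probability {p:.2f}, majority says delayed: {v}")
```

Soft voting keeps the information about how confident each model is, while hard voting only needs class labels, which can be convenient when the models come from different platforms and do not all expose comparable probabilities.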
