Thursday, October 16, 2014

SAS In-Memory Statistics (IMSTAT) for Hadoop Overview

SAS® In-Memory Statistics for Hadoop ("SASIMSH") is a single interactive programming environment for analytics on Hadoop that integrates analytical data preparation, exploration, modeling and deployment. It contains PROC IMSTAT and the SAS LASR Analytic Engine (SASIOLA). SASIOLA is essentially the 'BASE' of SASIMSH, whereas PROC IMSTAT covers the full analytical cycle, from BASE through modeling to deployment. This duality continues SAS's long-standing tradition of 'getting the same job done in different ways' to accommodate users' different styles, constraints and preferences.
This post provides an overview of IMSTAT, with a little associated coverage of the SASIOLA facilities.

You can certainly read through the IMSTAT details here at sas.com. Below is a summary picture that many have found to be a better way of capturing the features and functions of PROC IMSTAT. It reflects the procedure as of Q1 2014, but the spirit and gist have remained the same since.




Some comments about the 'total concept' first
1. If you are familiar with SAS products and solutions, you are used to seeing BASE (programming), STAT (statistical sciences), ETS (econometrics), OR (operations research), EM (enterprise data mining including machine learning, text mining and statistics), EG (enterprise guide) and MM (model management). Another line of SAS in-memory products still largely follows this convention: for example, HP Statistics (the high-performance counterpart of STAT), HPDM (the high-performance counterpart of EM) and so on. You are used to seeing a long list of procedures under each product or package.

Now, conceptual 'shock' #1 is that all the features listed in this IMSTAT picture are grouped under ONE procedure. Yes, IMSTAT is one procedure and one procedure only, with all of these features.

2. Why this change?

If you use any of the traditional SAS products mentioned above, you know that to get the work at hand done you likely engage only a very small set of the procedures, functions and statements afforded by the specific product you are licensed for. For example, I have been using STAT since about 1991, yet roughly half of the procedures under STAT remain strangers to me. I don't recall knowing anybody who uses all of the BASE capabilities either. In fact, I know friends who have held SAS jobs for many years; they are experienced, yet know only half a dozen procedures.
The reality, though, is that it is not cost-effective for a software developer to build just a few procedures for one company and another small set for another company.

One way SAS can address this gap between price and value is through software-on-demand offerings, an in-depth discussion of which is beyond the scope of this post. Another way is to redesign the package so that all the essential features and functions needed to get analytical jobs done are built in and integrated. The immediate next question is which features to pick, and from which existing packages. From an elementary 'can do' perspective, it is hard to imagine many things that SAS cannot do with its existing offerings; in many cases the challenge is how, not if. Still, a coherent organizing theme is needed to build the new piece. The good news is that such a theme has existed for many years.


The BLUE spoke in the center of the picture presents a diagram of the modern analytical life cycle, from problem definition, data preparation and exploration, through modeling, to deployment and presentation. PROC IMSTAT features and functions are organized and developed around this framework. In other words, PROC IMSTAT has collected core functions and features from across the SAS software families to address the needs and challenges confronting analytical users in the Hadoop world. Many 'pre-existing' features have been distilled and streamlined while being moved to the in-memory platform.


PROC IMSTAT is expanding rapidly to accommodate the ever-changing Hadoop world.


Some comments about PROC IMSTAT

1. The major genuinely new feature is actually on the right side of the picture: the recommender engine. Everything else has essentially been in existence in the SAS software family in some format or style.
2. All the entries listed under a box header label (such as Data Management in the top left corner) are IMSTAT statements. To access the statements, the user must invoke "PROC IMSTAT" first. Unlike many traditional SAS procedures, which have to be invoked many times, once PROC IMSTAT has been invoked the user can issue the same statement again and again, as the job requires.
3. Users who know certain SAS procedures in depth may find that a feature reduced from a regular SAS procedure to an IMSTAT statement (for example, the CORR statement stems from PROC CORR) no longer has as many options as its counterpart procedure. This is in part because the procedure has been reduced to a statement. The other reason is strategic design: the omission is intentional. Do you really think it makes sense to run all those distance options under the CLUSTER statement on data sets that are now much bigger?
4. Some statements are actually mini-solutions. The GroupBy statement, for example, is an "in-memory cube builder" or a genuine OLAP killer, even though it looks like a small statement.

I plan to publish specific use cases to help readers better understand how IMSTAT works. Thanks.

October 2014, from Wellesley, Massachusetts 
