
How to choose hardware for Big Data processing

Computers have processed information for decades, but the term "Big Data" had only become widespread by 2011. Big Data lets companies quickly extract business value from a wide variety of sources, including social networks, geolocation data transmitted by phones and other roaming devices, publicly available information from the Internet, and readings from sensors embedded in cars, buildings and other objects.

What is the VVV model?

Analysts use the 3V (VVV) model to define the essence of Big Data. The name stands for its three key properties: volume, velocity and variety.

  • Volume means that Big Data deals with large amounts of information, typically from 10 TB upwards.
  • Velocity means that information for Big Data is generated and changed very quickly (just think of the speed at which new hashtags spread on Twitter).
  • Variety means that data in multiple formats comes from multiple sources (e.g. text and video messages from social networks, readings from geolocation services).


Where Big Data is used

Big Data consists of large, diverse sets of information that are often generated, updated and provided by multiple sources. Modern companies use it to work more efficiently, create new products and ultimately become more competitive. Big Data accumulates every second: even as you are reading this, someone is collecting information about your preferences and browsing activity. Most companies use Big Data to improve customer service, while others use it to improve operations and predict risk.

For example, VISA uses Big Data to reduce fraudulent transactions, the developers of World of Tanks use it to reduce player churn, the German Ministry of Labour uses it to analyse unemployment benefit applications, and major retailers use it to plan large-scale marketing campaigns and sell as many products as possible.

What does working with Big Data look like?

It can be divided into the following stages:

  • Data collection. Sources can be open or internal. Open sources include data from government services, publicly available commercial information, social networks and online services. Internal sources include analytics and online transaction data. Standard application interfaces and protocols are used to transmit the information.
  • Data integration. Dedicated systems convert the collected data into a format suitable for storage, or monitor it continuously for important triggers.
  • Processing and analysis. Operations are performed in real time, except when the information is stored for later processing. Popular analysis techniques include association rule learning, classification, cluster and regression analysis, data fusion and integration, machine learning, pattern recognition and others.
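The three stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two in-memory sources, their field names and the trivial aggregation are hypothetical and stand in for real feeds and real analysis techniques.

```python
import csv
import io
import json
from collections import Counter

# Hypothetical raw inputs: an internal CSV export and an external JSON feed.
INTERNAL_CSV = "customer,amount\nalice,120\nbob,80\nalice,40\n"
EXTERNAL_JSON = '[{"user": "bob", "value": 15}, {"user": "carol", "value": 60}]'


def collect():
    """Stage 1: gather records from each source in its native format."""
    internal = list(csv.DictReader(io.StringIO(INTERNAL_CSV)))
    external = json.loads(EXTERNAL_JSON)
    return internal, external


def integrate(internal, external):
    """Stage 2: convert both sources to one common record layout."""
    records = [{"customer": r["customer"], "amount": float(r["amount"])} for r in internal]
    records += [{"customer": r["user"], "amount": float(r["value"])} for r in external]
    return records


def analyse(records):
    """Stage 3: a simple per-customer total standing in for real analysis."""
    totals = Counter()
    for r in records:
        totals[r["customer"]] += r["amount"]
    return dict(totals)


totals = analyse(integrate(*collect()))
print(totals)
```

In a real deployment each stage would typically run on separate, distributed infrastructure; the point here is only the order of the steps and the common record layout produced by integration.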

An important element of working with Big Data is search, which lets you get the information you need in different ways. In the simplest case, it works much like Google. Data is available to internal and external parties for a fee or for free, depending on the terms of ownership. Big Data is in demand among app and service developers, trading companies and telecommunications companies. For business users, information is offered in a visualised, easy-to-understand form: concise lists and excerpts if the format is text, or diagrams, charts and animations if it is graphical.


How to choose a platform for working with Big Data?

Handling Big Data requires a specific infrastructure focused on parallel processing and distributed storage of large volumes of data, but there is no one-size-fits-all solution. Although a huge number of factors influence the choice of hardware, the decisive one is the software used for Big Data collection and analysis. Accordingly, a company's hardware purchasing process looks like this:

  • Choose a Big Data software provider.
  • Research the infrastructure requirements published by the software developers.
  • Select hardware solutions based on these requirements.
  • Purchase the necessary hardware.

Thus, each project is unique, and the equipment for its deployment depends on the software chosen. As an example, consider two server solutions that are adapted to work with Big Data.

FUJITSU Integrated System PRIMEFLEX for Hadoop

This is a powerful and flexibly scalable platform designed for rapid analysis of large data sets of different types. It combines the advantages of a pre-configured hardware platform running on industry-standard components with dedicated open source software. The latter is provided by Cloudera and Datameer. The manufacturer guarantees the compatibility of the system components and its efficiency for complex analysis of structured and unstructured data. PRIMEFLEX for Hadoop is offered out-of-the-box, complete with business consulting services for Big Data, integration and maintenance.

FUJITSU Integrated System PRIMEFLEX for SAP HANA

This integrated system makes the most of SAP HANA. FUJITSU's PRIMEFLEX is suitable for storing and processing large amounts of data in RAM in real time. Calculations are performed both locally and in the cloud.

FUJITSU delivers PRIMEFLEX for SAP HANA as a complete package, with value-added services for all phases, from project decision and financing to ongoing operations. The product is based on components and technologies that have been certified for SAP. It covers different architectures, including preconfigured and scalable systems, customised configurations and virtualised VMware platforms.

Power BI is Microsoft's comprehensive business intelligence software. It combines several software products that share a common technological and visual design, connectors and web services. Power BI belongs to the class of self-service BI tools with in-memory computing. It is part of a single platform.


How to work with Power BI and why?

Many people don't like analytics because they don't understand how to work with it or why they should. Today, using Microsoft's Power BI as an example, we will show how knowing one simple analytics tool can make life easier for any business. And it doesn't matter whether you are an analyst or a marketer.

What is Power BI anyway?

If you've ever needed to make a beautiful report, you know that it's very time-consuming. You have to find the data, analyse it, put it together and visualise it beautifully. To simplify the process and the life of marketers/analysts/entrepreneurs, Microsoft came up with Power BI.

This free software can recognise and connect to more than 70 data sources, for example xlsx, csv and txt files or SQL databases. It can also clean up or process the data and bring a large number of tables into a single data model. You can also define custom metrics that are specific to your company.

A major advantage of Power BI is that it lets you make visually appealing and understandable reports. There are options for any need: histograms, charts, tables, slicers, cards and so on. All of this can then be saved to the cloud-based Power BI Service, where you can finalise the report together with your colleagues.

How does Power BI work?

Well, there are five components that make the system work:

  • "Get data" window
  • Power Query editor
  • links
  • data
  • reports

Together they form a kind of workflow. First we need to get the necessary data in the "Get data" window. It opens a dialog where we select the data to connect to. You can pull it from regular databases such as MySQL, from Excel spreadsheets or from Internet services like MailChimp, Facebook and others.

Once we have selected the right source, two panes appear: on the left you see the previously selected parameters, on the right the data itself. You can immediately click "Load" and start making reports, or choose "Edit" (labelled "Transform Data" in recent versions), which opens the Power Query editor.

The editor appears as a separate window, in which we can put everything we have loaded in order. At first glance it looks something like Word, Excel and similar programs: the toolbar at the top, all queries on the left, and the "Query Settings" pane on the right. This pane lists every operation you have performed on the data, such as deleting rows or renaming a column.

The logic is loosely similar to working with layers in Photoshop. In the editor we can clean up and process data, bring data from different sources into a uniform shape, and merge or split tables and columns.

The main working area with the data is in the middle. Once you have created all the queries, click "Close & Apply". You will return to the main window, and the program will remember all the queries you have created. From then on, whenever you refresh the data, all the transformations are applied to the source data automatically.
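The idea that recorded steps are replayed on every refresh can be illustrated with a small Python sketch. This is not how Power Query is implemented and it does not use its M language; the step functions, column names and sample rows are all hypothetical.

```python
# Each step is a named function that takes a list of rows and returns a new list.
def drop_empty_rows(rows):
    return [r for r in rows if r.get("city")]


def rename_city_to_region(rows):
    return [{"region": r["city"], "sales": r["sales"]} for r in rows]


def sales_to_number(rows):
    return [{**r, "sales": int(r["sales"])} for r in rows]


# The recorded list of steps, in the order the user performed them.
APPLIED_STEPS = [drop_empty_rows, rename_city_to_region, sales_to_number]


def refresh(source_rows):
    """Replay every recorded step against the current source data."""
    rows = source_rows
    for step in APPLIED_STEPS:
        rows = step(rows)
    return rows


first_load = [{"city": "Berlin", "sales": "10"}, {"city": "", "sales": "3"}]
updated_load = first_load + [{"city": "Paris", "sales": "7"}]

result_first = refresh(first_load)
result_updated = refresh(updated_load)
print(result_first)
print(result_updated)
```

Because the steps are stored rather than their results, new rows in the source go through the same cleaning without any extra work from the user.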

Power BI: Links

Next we move on to the "links" mode (Power BI itself calls these links relationships). By the way, if the data has already been prepared, you can skip all the steps above and go straight to the links.

It's relatively simple: we can set links between columns of different tables, choose their direction (one-way or bidirectional) and connect multiple tables with each other. Here, as with the editor, it pays to learn the tools so that the resulting model is clear and precise.
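As a rough illustration of what a link and its direction mean, here is a plain Python sketch with two hypothetical tables connected through a `customer_id` column. It only mimics the filtering behaviour conceptually and is not Power BI's engine.

```python
# Hypothetical "one" side of the link: one row per customer.
customers = [
    {"customer_id": 1, "country": "DE"},
    {"customer_id": 2, "country": "FR"},
]
# Hypothetical "many" side: several orders can point at one customer.
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 50},
    {"order_id": 11, "customer_id": 1, "amount": 30},
    {"order_id": 12, "customer_id": 2, "amount": 70},
]


def orders_for_country(country):
    """One-way link: a selection on customers narrows the orders table."""
    ids = {c["customer_id"] for c in customers if c["country"] == country}
    return [o for o in orders if o["customer_id"] in ids]


def countries_with_orders_over(threshold):
    """Reverse direction: a selection on orders narrows customers.

    In a model this direction is only available with a bidirectional link.
    """
    ids = {o["customer_id"] for o in orders if o["amount"] > threshold}
    return sorted(c["country"] for c in customers if c["customer_id"] in ids)


de_total = sum(o["amount"] for o in orders_for_country("DE"))
big_spenders = countries_with_orders_over(60)
print(de_total, big_spenders)
```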

Power BI: Data

Data mode lets you extend your current data model with calculations: measures, calculated tables and calculated columns. An important point is that all calculations are written as formulas in a special language called DAX. This is a language of functions and formulas that Microsoft has developed for its products. You have probably come across it if you have worked with Power Pivot in Excel.
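The difference between a calculated column and a measure can be shown with a short Python analogy. In DAX a measure might be written as `Total Revenue = SUM(Sales[Revenue])`; the table and column names here, and in the sketch below, are hypothetical.

```python
# Hypothetical sales table.
sales = [
    {"product": "A", "price": 5.0, "qty": 3},
    {"product": "B", "price": 2.0, "qty": 10},
    {"product": "A", "price": 5.0, "qty": 1},
]

# Calculated column: evaluated row by row and stored with each row.
for row in sales:
    row["revenue"] = row["price"] * row["qty"]


# Measure: an aggregation evaluated over whatever rows the current filter leaves.
def total_revenue(rows):
    return sum(r["revenue"] for r in rows)


all_products = total_revenue(sales)
only_a = total_revenue([r for r in sales if r["product"] == "A"])
print(all_products, only_a)
```

The same measure returns a different number depending on the filter applied, which is why measures are the usual choice for values shown in reports.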

Power BI: Reports

Finally, we come to the most important part: the "Reports" mode. This is where things become presentable and really clear. All of the report options are in the "Visualizations" pane. There is also a "Filters" pane, which lets you filter data at the level of a single visual, a page or the whole report.

Generally speaking, the "Reports" mode is the simplest part of Power BI. Here you simply drag and drop the chart you want onto the report canvas or apply a filter.

Who can benefit from working with Power BI

In fact, knowledge of this program can be useful to many people. It is used by product analysts, SEO specialists, developers and testers. Power BI is as useful in an IT company as it is in e-commerce. After all, it is always better to rely on real figures when deciding where to go next.

A minimum use case is to look at ready-made reports from colleagues to draw conclusions or to see the amount of current stock. The software has a real-time dashboard.

A marketer can look at the profitability of different sales channels in order to strengthen some of them or shut others down altogether. By the way, Power BI can be connected to Google Analytics to show, for example, the number of visits to the website.

A salesperson can also browse reports to understand their own effectiveness or to study data on new customers. Company managers mainly need to read the reports in order to understand what is going on in general. By the way, reports can even be viewed from the mobile app, which is handy when travelling on business.

Well, the creation of these reports can be done by anyone – the commercial director, the head of the sales department, etc. Of course, at a more in-depth and professional level, analysts do it.

To summarize

Power BI is a true savior in a world of enormous amounts of data that need to be organized in a clear way. Most importantly, you can do this with any type of data and bring it into a single view. For example, you can combine data from Google Analytics and MySQL in one report.

It's quite easy to use, so it's not just for analysts who want to learn its functionality. All of the reports generated may be stored in the cloud. This means they can be viewed at any time, anywhere, and conclusions can be drawn.

 

Probably every systems or business analyst, at some stage of their career, thinks it would be nice to get a professional certificate. There are a number of books dedicated to this topic (the CBAP book and study materials are probably the best), but in this article I will try to answer a different question: is certification necessary, and why?

What kind of certification is it?

There are several organizations in the world that allow business analysts to obtain certification and thereby confirm their professional level. I have looked at the most common organizations and certificates, namely:

International Institute of Business Analysis (IIBA). Offers certifications for analysts of all levels, from beginner ECBA to seasoned CBAP professionals.

INFORMS runs the Certified Analytics Professional (CAP) program, which also offers two levels of certification.

The Project Management Institute (PMI) is best known for its Project Manager certifications. But they also offer PMI-PBA certification for business analysts.

The International Requirements Engineering Board (IREB) offers multiple levels of CPRE certification for requirements analysts, which is more suitable for IT analysts.

The International Qualification Board for Business Analysis (IQBBA) offers two levels of certification: entry-level analyst and advanced analyst.

These certificates are positioned mainly for business analysis specialists, but in our country they are also treated as confirmation of level for system analysts, requirements analysts, software analysts and so on. There is plenty of room for discussion there, but that is not my topic here.

Why is this needed?

I think that every analyst, having delved into the subject, will be able to find an answer to this question. I have considered the three reasons most frequently cited for any certification:

  • certified professionals earn more;
  • preparation for certification helps to organize your knowledge and allows you to identify gaps in your professional skills;
  • certification is a way to prove to yourself that you are cool.

Let’s take a closer look. I launched a survey in communities and analyst chats and collected about four dozen responses about what analysts themselves think about this.

Do you recommend that business analysts get certified and why?

I recommend getting certified. But first, specialists should evaluate their capabilities and whether they really need to pass the exam at this point in their career. It is important to clearly understand your expectations from certification: whether this particular certificate is suitable, and what benefits it will bring at work, both to the specialist and to the employer. In addition, it is worth checking whether there is an easier and faster way to get the expected benefits, and assessing whether you can set aside time to prepare for certification if it is still needed.

Certification is useful for consolidating knowledge and improving the quality of your artifacts in real work. In addition, relying on theory helps to steer a discussion in a constructive direction, be it a talk at a conference or a meeting within the team. Another, less obvious advantage of certification is that it highlights the strong and weak areas in your own competencies. That is, the standard helps specialists understand what they have already mastered, what needs to be improved and which topics are completely new. They can draw up an individual development plan in accordance with the standard, demonstrate their strengths to their manager and understand which new skills they need to master in order to ask for the corresponding tasks.

But you shouldn't consider certification a tool for raising your salary in your current job. However, you can agree with your manager on payment for the exam or on working hours allocated to prepare for it. From the manager's point of view, a certificate will not be decisive, but it will definitely make the employee stand out among hundreds of others.

What methods of preparation for exams would you recommend?

Any theory fades if it is not applied in practice. There is no need to memorize anything: it will be useless both for passing the exam and for your development. To prepare, you should study the standard consistently and immediately look for tasks where you can apply the knowledge gained. And whenever you doubt whether you are doing the right thing or how best to act in a situation, use the standard as a reference, find the necessary information and apply it.

A community of like-minded people who are preparing for the same certificate also helps. If there is no one around, start such a group yourself within your company or region. Lively discussion and outside opinions will have a beneficial effect on both motivation and the end result.