||Big Data Analysis Interview with Steve Harris Chief Technology Officer at Garlik, an Experian Company
Keywords: big data analysis, chief technology officer, experian, garlik, steve harris
||Big Data Analysis Interview with Ricardo Baeza-Yates VP of Research for Europe and Latin America at Yahoo!
The main theme for Ricardo was that there are two areas to invest in: a) what he called Hadoop++, the ability to handle graphs with trillions of edges, since MapReduce does not scale well for graphs; and b) stream data mining, the ability to handle streams of large volumes of data. Handling lots of data in a 'reasonable' amount of time is key for Ricardo - for example, being able to carry out offline computations within a week rather than a year. He was also very interested in personalisation and its relation to privacy: rather than personalising based on user data, we should personalise around user tasks.
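A core idea behind the stream data mining Ricardo mentions is processing each item exactly once in bounded memory. As an illustrative sketch (not anything Yahoo!-specific), reservoir sampling keeps a uniform random sample of a stream whose length is unknown in advance:

```python
import random

def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = rng or random.Random()
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)
        else:
            # Each later item replaces a stored one with probability k/(i+1),
            # which keeps the sample uniform over everything seen so far.
            j = rng.randrange(i + 1)
            if j < k:
                sample[j] = item
    return sample
```

The appeal for large volumes of data is that memory use is O(k) regardless of how long the stream runs.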
Keywords: big data analysis, ricardo baeza yates, vp research, yahoo!
||Big Data Analysis Interview with Bill Thompson Head of Partnership Development, BBC Archives
According to Bill Thompson, the term Big Data as a general label should be viewed sceptically, as to his mind there is nothing fundamentally new in computer science terms. However, he does agree that certain technologies should be invested in, and in particular that the EU should invest from a public service point of view to counteract large companies that will focus purely on areas for profit.
He also provided two UK-related analogies for outcomes that should be avoided: firstly, UK schools suffering in computer science education because MS Office was adopted; and secondly, big pharma not investing in cures for malaria.
Bill Thompson, Head of the Partnership Development within the BBC Archives Development group, is an English technology journalist, commentator and writer, best known for his weekly column in the Technology section of BBC News Online and his appearances on Click, a radio show on the BBC World Service.
Keywords: bbc, big data analysis, bill thompson
||Big Data Analysis Interview with Peter Mika Researcher at Yahoo!
The main theme for Peter was using machine learning, information extraction and semantic web technologies to reduce Big Data into more manageable chunks, and how, combined with new programming paradigms such as Hadoop, we can now accomplish more. Background knowledge (in a simple form) enables Yahoo! to understand that "Brad Pitt Fight Club" is a search for a movie featuring Brad Pitt, an example of entity disambiguation.
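The kind of disambiguation Peter describes can be sketched with a toy lookup: if background knowledge records that "Brad Pitt" is an actor in the film "Fight Club", the query can be read as movie intent rather than two unrelated strings. The tiny knowledge base and `interpret` function below are hypothetical, not Yahoo!'s actual system:

```python
# Hypothetical background knowledge: entity name -> type and relations.
KNOWLEDGE_BASE = {
    "brad pitt": {"type": "actor", "films": {"fight club", "se7en"}},
    "fight club": {"type": "film", "cast": {"brad pitt", "edward norton"}},
}

def interpret(query):
    """Match known entities in the query and infer the likely intent."""
    q = query.lower()
    entities = [name for name in KNOWLEDGE_BASE if name in q]
    films = [e for e in entities if KNOWLEDGE_BASE[e]["type"] == "film"]
    actors = [e for e in entities if KNOWLEDGE_BASE[e]["type"] == "actor"]
    # If an actor in the query appears in the film's cast, read the
    # query as a movie search rather than unrelated keywords.
    if films and actors and actors[0] in KNOWLEDGE_BASE[films[0]]["cast"]:
        return {"intent": "movie", "film": films[0], "actor": actors[0]}
    return {"intent": "unknown", "entities": entities}
```

Even this crude form shows why "a simple form" of background knowledge goes a long way: the cast relation is what links the two entity mentions.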
Peter Mika is a researcher working on the topic of semantic search at Yahoo Research in Barcelona, Spain. Peter also serves as a Data Architect for Yahoo Search, advising the team on issues related to knowledge representation. Peter is a frequent speaker at events, a regular contributor to the Semantic Web Gang podcast series and a blogger at tripletalk.wordpress.com.
Keywords: big data analysis, peter mika, researcher, yahoo!
||Big Data Analysis Interview with Alon Halevy Research Scientist at Google
In the interview Alon drew upon his work on Google Fusion Tables, which allows users to upload and store their datasets. A collection of technologies, not necessarily new but now beginning to work at scale, is having an impact. These include: reconciling entities (determining that X and Y are the same thing); resolving schema and ontology differences; extracting high-quality data from the web; large ontology-based datasets (typically built from Wikipedia); and crowdsourcing computation.
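Entity reconciliation, the first technique Alon lists, can be approximated very simply with normalised string similarity; real systems use much richer signals, so treat this as a minimal sketch with an assumed threshold:

```python
from difflib import SequenceMatcher

def same_entity(a, b, threshold=0.85):
    """Crude reconciliation: are two name strings likely the same entity?

    Normalises case, punctuation and whitespace, then compares with a
    sequence-similarity ratio. The 0.85 threshold is an assumption.
    """
    def norm(s):
        return " ".join(s.lower().replace(".", "").replace(",", "").split())
    return SequenceMatcher(None, norm(a), norm(b)).ratio() >= threshold
```

At web scale the hard parts are the ones this sketch ignores: blocking so you never compare all pairs, and using context (addresses, links, co-occurrence) rather than names alone.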
Alon Halevy leads the Structured Data research team at Google Research. Prior to that, Alon was a professor of Computer Science at the University of Washington, where he started the UW CSE Database Group in 1998 and worked in the field of data integration and web data.
Keywords: alon halevy, big data analysis, google, research scientist
||Big Data Analysis Interview with Jeni Tennison Technical Director of the Open Data Institute
Jeni discussed how open data can be found and combined to serve decision making. A key technology of interest, pointed out by Jeni, is the discovery of datasets distributed across the internet, together with tools that automate this discovery.
Within the wider UK public sector, Jeni Tennison worked on the early linked data work on data.gov.uk, helping to engineer new standards for the publication of statistics as linked data; building APIs for geographic, transport and education data; and supporting the publication of public sector organograms as open data. She continues her work within the UK's public sector as a member of both the UK Government Linked Data Group and the Open Data User Group.
Keywords: big data analysis, jeni tennison, open data institute, technical director
||Big Data Analysis Interview with Hjalmar Gislason and Edward Farmer of DataMarket.com
Hjalmar Gislason is the founder and CEO of DataMarket.com. In this interview he covers the areas of data visualization and data modelling via semantics. He believes simplicity of use is crucial to success and that a lot of technologies, like the Semantic Web stack, are over-engineered. According to him there is high demand for the "democratization of semantic technologies": making everything accessible through a web browser and dealing with legacy versions of IE.
DataMarket helps business users find and understand data, and helps data providers efficiently publish their data and reach new audiences. DataMarket.com provides access to thousands of data sets holding hundreds of millions of facts and figures from a wide range of public and private data providers, including the United Nations, the World Bank, Eurostat and the Economist Intelligence Unit. The data portal allows this data to be searched, visualized, compared and downloaded in a single place in a standard, unified manner.
Keywords: big data analysis, datamarket.com, edward farmer, hjalmar gislason
||Big Data Analysis Interview with Andraz Tori Founder and CTO of Zemanta
In this interview Andraz mainly covers the Hadoop framework, explains why it was successful, and provides interesting remarks on why the US currently seems to do better than Europe in Big Data technologies.
Andraz Tori is the CTO and co-founder of Zemanta, a five-year-old company dealing with semantic analysis of text for the purpose of providing a personal writing assistant and general-purpose recommendations. In terms of Big Data, Andraz characterizes Zemanta as "small data" inside Big Data: the company operates on terabytes of compressed data, running CPU-intensive operations.
Keywords: andraz tori, big data analysis, zemanta
||Big Data Analysis Interview with Jim Webber Chief Scientist at Neo
In this interview Jim covered graph databases, which are used by major Web players such as Facebook and Google. Graph databases are making a comeback because the nature of the data we see has changed: data is sparse, non-uniform, heterogeneous and well connected.
This technology will have a major impact horizontally. As graph technology matures, retail will become more finely tuned to personal circumstances and fraud detection will become easier.
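The "well connected" data Jim describes is exactly what graph stores handle natively: queries like "everyone within two hops of this person" are traversals rather than chains of relational joins. A minimal in-memory sketch (the social graph below is made up for illustration):

```python
from collections import deque

# Hypothetical social graph as adjacency sets; a graph database runs
# this kind of traversal as a native operation instead of repeated joins.
GRAPH = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice"},
    "dave": {"bob", "eve"},
    "eve": {"dave"},
}

def within_hops(graph, start, max_hops):
    """Return every node reachable from start in at most max_hops edges."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    seen.discard(start)
    return seen
```

Fraud detection uses the same primitive at larger depths, looking for suspiciously connected rings of accounts.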
Neo Technology brings the power of graph databases to a wide variety of clients. Neo4j, the world's leading graph database, has the largest ecosystem of partners and developers and tens of thousands of successful deployments. From websites adding social capabilities, to telcos providing personalized customer services, to innovative bioinformatics research, organizations adopt graph databases to quickly model and query connected data.
Keywords: big data analysis, jim webber, neo
||Big Data Analysis Interview with Francois Bancilhon CEO of Data Publica
In this interview Francois said that a priority need for Data Publica is good, cheap access to up-to-date snapshots of the Web. He also talked about the need for performance improvements to information extraction tools, and how public sector bodies such as the Commission could aid in the provision of standard named entity repositories (e.g. all the companies in a country or in Europe).
Data Publica is a young French startup comprising a group of data developers responsible for producing datasets (custom datasets based on customer specifications and off-the-shelf datasets based on market demand). Producing data involves identifying the sources of data; extracting the data; turning raw data into structured data; and, finally, delivering it to the customer.
Keywords: big data analysis, data publica, francois bancilhon