Showing posts with the label Koelner R User.

Monday, 3 March 2014

Review: Kölner R Meeting 26 February 2014

Last week's Cologne R user group meeting was all about R and databases. We had three talks, ranging from a generic overview of how to connect R to databases, to a specific example with kdb+, and perhaps a glimpse of the future with ArangoDB, a NoSQL database.

Connecting R with databases

Diego de Castillo's talk focused on the use of relational databases, such as PostgreSQL, SQLite and Oracle. For all these databases, dedicated R drivers exist on CRAN that can be used in a generic way via the DBI package. This allows for a consistent approach to connecting, querying and returning data to R. A popular alternative to the dedicated drivers on Windows is the ODBC (Open Database Connectivity) API via RODBC; RJDBC offers a similar route through JDBC.
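To illustrate the generic DBI approach, here is a minimal sketch. It assumes the DBI and RSQLite packages are installed; the in-memory SQLite database and the use of the built-in mtcars data set are my own illustrative choices, not taken from Diego's talk. With another DBI-compliant driver, only the dbConnect() call should need to change.

```r
# Minimal sketch of the connect / query / return pattern with DBI.
# Assumption: DBI and RSQLite are installed; the in-memory database
# and the mtcars table are illustrative only.
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")  # connect
dbWriteTable(con, "mtcars", mtcars)              # copy a data frame into the database

# Query: the result comes back as an ordinary data frame
res <- dbGetQuery(
  con,
  "SELECT cyl, AVG(mpg) AS avg_mpg FROM mtcars GROUP BY cyl ORDER BY cyl"
)
print(res)

dbDisconnect(con)                                # close the connection when done
```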


R and kdb+

Kim Kuen Tang gave an overview of kdb+, a proprietary database that appears to be popular for time series data. kdb+ comes with its own expressive query language, q. Kim demonstrated how he could analyse large amounts of stock market data stored in a kdb+ database using R and q, all from within the Sublime Text editor.

ArangoDB

Michael Hackstein and Claudius Weinberger introduced us to ArangoDB, a NoSQL (Not only SQL) database. ArangoDB is an open-source document database. This means that data is stored as documents, which are similar to JavaScript objects, in so-called "collections". Their slides nicely presented the different concepts outside traditional relational databases, such as key-value stores, document stores and graph data. Claudius mentioned that they had received several requests from users who wanted to connect R to ArangoDB. Although a native driver does not exist for R yet, ArangoDB can be accessed from R through its HTTP API via the packages bitops, RCurl and RJSONIO.


Next Kölner R meeting

The next meeting is scheduled for 23 May 2014. This will be our 10th meeting, clearly something we need to celebrate!

Please get in touch if you would like to present and share your experience, or indeed if you have a request for a topic you would like to hear more about. For more details see also our Meetup page.

Thanks again to Bernd Weiß for hosting the event and Revolution Analytics for their sponsorship.

Tuesday, 17 December 2013

Review: Kölner R Meeting 13 December 2013

Last week's Cologne R user group meeting was the best attended so far. Well, we had a great line-up indeed. Matt Dowle came over from London to give an introduction to the data.table package. He was joined by his collaborator Arun Srinivasan, who is based in Cologne. Their talk was followed by Thomas Rahlf on Datendesign mit R (Data design with R).

data.table


Matt's goal with the data.table package is to reduce time: both the time to write code and the time to execute it. His talk illustrated how the syntax of data.table, not unlike SQL, can produce shorter and more readable code that at the same time provides an efficient and fast way to analyse large in-memory data sets with R. Arun presented new developments in data.table 1.8.11, which not only fixes bugs but also adds many new features, such as melt/cast, and further speed gains.
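As a rough illustration of the SQL-like syntax, here is a small sketch. It assumes the data.table package is installed; the stock data and column names are made up for this example and are not from Matt's or Arun's slides.

```r
library(data.table)

# Hypothetical stock data, for illustration only
DT <- data.table(
  ticker = c("AAA", "AAA", "BBB", "BBB", "BBB"),
  price  = c(10, 12, 20, 22, 24),
  volume = c(100, 150, 200, 50, 75)
)

# General form DT[i, j, by]:
#   i  filters rows      (like WHERE)
#   j  computes columns  (like SELECT)
#   by groups the result (like GROUP BY)
res <- DT[volume >= 75,
          .(mean_price = mean(price), total_volume = sum(volume)),
          by = ticker]
print(res)
```

The whole filter, summarise and group step fits into a single bracket expression, which is where much of the saving in writing time comes from.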

I said earlier that data.table rocks. For more details see the data.table home page.

Data design with R


Thomas Rahlf: Datendesign mit R

Thomas Rahlf talked about his forthcoming book Datendesign mit R (Data design with R). He shared with us his motivations and aims for the book. In his opinion, there are many books that present beautiful charts and concepts (e.g. Tufte's books) but don't show how they can be reproduced, as they are often created with software such as Adobe Illustrator. Other books explain the graphical functions of a software package, yet fail to demonstrate how to create beautiful charts with them. Thus, Thomas' book will contain 100 examples demonstrating that desktop-publishing-quality charts can be produced with R and, in some cases, with the help of LaTeX. Indeed, all examples have about 40 lines of code and use only the base R graphics system, not grid or add-ons such as lattice or ggplot2.

The book's accompanying web site gives you a taster already. The book itself will be published by Open Source Press next month.

The Schnitzel

Of course the evening ended with Schnitzel and Kölsch at the Lux.

The Luxus Schnitzel. Photo by Günter Faes

Next Kölner R meeting

The next meeting is scheduled for 26 February 2014 (Wednesday before Altweiber), with two talks by Diego de Castillo (Connecting R with databases) and Kim Kuen Tang (R and kdb+).

Please get in touch if you would like to present and share your experience, or indeed if you have a request for a topic you would like to hear more about. For more details see also our Meetup page.

Thanks again to Bernd Weiß for hosting the event and Revolution Analytics for their sponsorship.

Monday, 21 October 2013

Review: Kölner R Meeting 18 October 2013

The Cologne R user group met last Friday for two talks on split apply combine in R and XLConnect by Bernd Weiß and Günter Faes respectively, before the usual Schnitzel and Kölsch at the Lux.

Split apply combine in R




The apply family of functions in R is incredibly powerful, yet for newcomers often somewhat mysterious. Thus, Bernd gave an overview of the different apply functions and their cousins. The various functions differ in their object inputs, e.g. vectors, arrays, data frames or lists, and in their outputs. Other related functions are by, aggregate and ave. While functions like aggregate reduce the output size, others like ave return as many rows as the input object and repeat the results where necessary.
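The difference between the reducing and the length-preserving functions is perhaps easiest to see on a toy example. The following sketch uses base R only; the data frame is made up for illustration and is not from Bernd's slides.

```r
# Toy data, for illustration only
df <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6)
)

# Explicit split-apply-combine: split into a list, apply mean to each piece,
# and let sapply combine the results into a named vector
by_hand <- sapply(split(df$value, df$group), mean)

# tapply does the same in one call
by_tapply <- tapply(df$value, df$group, mean)

# aggregate reduces the output: one row per group
agg <- aggregate(value ~ group, data = df, FUN = mean)

# ave keeps the input length: every row receives the mean of its group
df$group_mean <- ave(df$value, df$group, FUN = mean)

print(agg)
print(df)
```

Here aggregate returns two rows (one per group), whereas ave returns five values, repeating each group mean for every row of that group.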

As an alternative to the base R functions, Bernd also touched on the **ply functions of the plyr package. The function names are certainly easier to remember, but their syntax can be a little awkward, for example the .() helper used to quote variable names. Bernd's slides, in German, are already available from our Meetup site.

XLConnect

When dealing with data stored in spreadsheets, most members of the group rely on read.csv and write.csv in R. However, if you have a spreadsheet with multiple tabs and formatted numbers, read.csv becomes clumsy, as you would have to save each tab without any formatting in a separate file.

Günter presented the XLConnect package as an alternative to read.csv or indeed RODBC for reading spreadsheet data. It uses the Apache POI API as the underlying interface. XLConnect requires a Java runtime environment on your computer, but no installation of Excel. That makes it a truly platform-independent solution for exchanging data between spreadsheets and R. Not only can you read defined rows and columns, or indeed named ranges, from Excel into R, but you can also write data back to Excel files in the same way and, to top it all, graphic output from R as well.
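A short sketch of the round trip might look as follows. It assumes the XLConnect package and a Java runtime are installed; the temporary file, the sheet name and the use of mtcars are hypothetical choices for illustration, not taken from Günter's talk.

```r
# Assumption: XLConnect and a Java runtime are installed.
library(XLConnect)

path <- tempfile(fileext = ".xlsx")          # hypothetical file, for illustration
wb <- loadWorkbook(path, create = TRUE)      # create a new workbook

createSheet(wb, name = "cars")
writeWorksheet(wb, mtcars, sheet = "cars")   # write a data frame, header in row 1
saveWorkbook(wb)

# Read back only a defined block: the header row plus the first five
# data rows, restricted to the first three columns
wb2 <- loadWorkbook(path)
first_rows <- readWorksheet(wb2, sheet = "cars",
                            startRow = 1, endRow = 6,
                            startCol = 1, endCol = 3)
print(first_rows)
```

Because each sheet is addressed by name, a workbook with several tabs can be read without exporting every tab to its own CSV file first.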

Next Kölner R meeting

The next meeting is scheduled for 13 December 2013. A discussion of the data.table package is already on the agenda.

Please get in touch if you would like to present and share your experience, or indeed if you have a request for a topic you would like to hear more about. For more details see also our Meetup page.

Thanks again to Bernd Weiß for hosting the event and Revolution Analytics for their sponsorship.
