Main Page

From MonetDB

Latest revision as of 23:05, 22 November 2020

What is this "wiki" thing?

If you're new to wikis, please read a brief explanation of what they are. You might then want to continue to this short introduction video.

We use this Wiki for easily updatable, easily expandable, and user-editable MonetDB-related content, as well as for some coordination within our group.

Consult the MediaWiki User Guides page for detailed guides on using various aspects of MediaWiki (the platform on which this Wiki is built).

Getting started with MonetDB

  • MonetDB:Getting started
  • MonetDB:Building from sources
  • MonetDB:Building from sources on OS X
  • MonetDB:Installing on OS X
  • MonetDB:Various tips

Internal

Software development

  • Development policy
  • Nightly testing policy
  • MonetDB type system

Project related

  • MonetDB RDF
  • BitWise Decomposition
  • Vera2015 (Vera's internship, summer 2015)

Organization

  • Thursday Think Tank (TTT)
  • Conferences
  • Astronomy: Bulk Source Association

SciLens cluster

The SciLens cluster has been acquired to provide a sizable and flexible experimentation platform for the Database Architectures group at CWI. Access is granted to members of the DA group.

SciLens usage policy

If you are cooperating with non-DA members who are (temporarily) granted access to our SciLens cluster (no non-DA people have access to our SciLens cluster without cooperating with us!), please make sure to (1) inform them about our usage (reporting/claiming) policies (see below) and (2) introduce them and their work to the group.

If you plan to use machines in our SciLens cluster, please make sure you report your usage (plans) and claim machines the usual way via the dedicated wiki pages. Please do not forget to release them again once you're done with them! Do not hesitate to ask in case you have any questions!

SciLens backup

The data on the SciLens machines is not backed up by ITF. To ensure continued access to your data, you either have to back it up yourself (within the cluster, on your desktop, or on external storage) or be able to re-generate it. If in doubt about where and how to make your backup, contact our system administrator, Arjen de Rijke.

Logging into the cluster

Your first login

Logging into the cluster is regulated through the scilens2-ssh (virtual) machine. Initially, you _cannot_ access it; please ask Arjen for access. With his help you'll obtain a special key file for use with the cluster, which you can place at $HOME/.ssh/id_rsa_scilens on your machine.

Thereafter you can SSH to the scilens2-ssh gateway machine:

[someone@somedesktop ~]$ ssh -A -i $HOME/.ssh/id_rsa_scilens scilens2-ssh.da.cwi.nl 

and subsequently move to the specific machine you desire to use, e.g.

[someone@scilens2-ssh ~]$ ssh bricks09


Make sure you can ssh into the cluster via the gateway machine. On the gateway machine, a key has been created for you, probably named .ssh/id_rsa_scilens_your-username. Copy this key to the .ssh directory on your desktop machine.

Add this information to the config file $HOME/.ssh/config on your CWI desktop machine:

Host scilens2-ssh.da.cwi.nl
        User your-username
        IdentityFile=/home/your-username/.ssh/id_rsa_scilens
        ForwardAgent=yes
Host bricks* rocks* pebbles2* diamonds* stones* gems*
        User your-username
        IdentityFile=/home/your-username/.ssh/id_rsa_scilens_your-username
        ProxyCommand ssh scilens2-ssh.da.cwi.nl -W %h:%p

Note that technically there are three different usernames involved: on your own machine, on the gateway machine, and on the target machine in the cluster. If for some reason they are not all the same, then in each machine's entry, your-username is the username on that machine.
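With the config in place you can sanity-check what ssh will actually do for a given host without connecting; `ssh -G` prints the resolved client configuration:

```shell
# Show the options ssh would apply for bricks09; with the configuration above
# installed, the output should include your username and the ProxyCommand.
ssh -G bricks09 | grep -iE '^(user|proxycommand) ' || true
```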

You will now be able to issue an SSH command directly:

[someone@somedesktop ~]$ ssh bricks09

and the connection will be automatically proxied over an SSH connection to the gateway machine.

Notes:

  • Remember to make $HOME/.ssh/config readable only by you if you've just created it.
  • Add the keys to your key chain using ssh-add.
  • This might conflict with other hostnames beginning with bricks, rocks, or pebbles2.
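The two housekeeping notes above translate into a few commands (a minimal sketch; the key path is the one used earlier, and ssh-add assumes an agent is running):

```shell
# OpenSSH insists on strict permissions: your private keys and your config
# file should be readable only by you.
mkdir -p "$HOME/.ssh"
touch "$HOME/.ssh/config"
chmod 700 "$HOME/.ssh"
chmod 600 "$HOME/.ssh/config"

# Load the cluster key into your key chain (skipped if the key isn't there).
if [ -f "$HOME/.ssh/id_rsa_scilens" ]; then
    ssh-add "$HOME/.ssh/id_rsa_scilens"
fi
```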

Disk usage on the cluster

When you log in to one of the SciLens machines, you land in your home directory on that machine. Try to avoid using this directory: it is mounted on a small disk, shared by all users, and it is deleted when the operating system is reinstalled. Restrict its use to configuration files, symlinks, etc.

For most practical cases, you should make yourself a directory in /scratch or /data, where there should be ample disk space.
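For example (a sketch; the SCRATCH variable stands in for the cluster's /scratch mount so you can also try it elsewhere):

```shell
# Create a personal directory on the big scratch disk and check free space.
# On a SciLens machine, run this with SCRATCH=/scratch.
SCRATCH=${SCRATCH:-$(mktemp -d)}
ME=${USER:-$(id -un)}
mkdir -p "$SCRATCH/$ME/experiments"
df -h "$SCRATCH"
```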

Copying data in and out of the cluster

As explained above, all communication into the cluster is mediated by SSH through the scilens2-ssh virtual machine, and this is specifically true when you want to send files in. Fortunately, if you've configured automatic proxying as described above, all SSH-related utilities will work. You can thus push and pull files from the outside:

[someone@somedesktop ~]$ scp foo bricks09:/scratch/someone/bar
[someone@somedesktop ~]$ scp bricks09:/scratch/someone/bar foo

What about pushing and pulling from the inside? It seems you're limited to using key files for authentication when going out of the cluster to our desktop machines, so you should generate yourself an (RSA) key pair on the cluster machine, e.g.

[someone@bricks09 ]$ ssh-keygen -t rsa

and then you can do

[someone@bricks09 /scratch/someone]$ scp foo somedesktop.ins.cwi.nl:bar
[someone@bricks09 /scratch/someone]$ scp somedesktop.ins.cwi.nl:bar foo

as well.
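The key generation step can also be done non-interactively; a sketch (KEYDIR stands in for $HOME/.ssh on the cluster machine, and the empty passphrase is just for illustration):

```shell
# Generate an RSA key pair without prompts; -N '' sets an empty passphrase
# (use a real passphrase for keys you keep around).
KEYDIR=$(mktemp -d)
ssh-keygen -q -t rsa -b 4096 -N '' -f "$KEYDIR/id_rsa"
# To enable logins to your desktop, append the public key to
# ~/.ssh/authorized_keys there, e.g. with:
#   ssh-copy-id -i "$KEYDIR/id_rsa.pub" somedesktop.ins.cwi.nl
ls "$KEYDIR"
```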

Note: if you don't provide scp with a path for the local file, the current working directory ($PWD) is used; but if you don't specify a path on the remote machine and just write machine:filename, you'll be referring to your home directory on that machine.

Copying & synchronizing data within the cluster

Copying files within the cluster is straightforward using the scp command, identifying the machines and directory locations involved; e.g., while on bricks09 you can copy the file data to bricks10:

scp data bricks10:/scratch/your_username

You can also use the rsync command to clone your environment easily on multiple machines. This requires an rsync daemon configuration file; create a file rsyncd.conf in your /scratch/your_username directory containing:

port = 2873
use chroot = no
[scratch]
path = /scratch/your_username

The port number is free to choose above 1000; pick an uncommon one, otherwise conflicts with other users might occur. Let's assume again you are on bricks09 and you want a synchronized copy on bricks10. On bricks09, create the above configuration file and start the rsync daemon with the command:

$ rsync --daemon --config=/scratch/your_username/rsyncd.conf

Then log in to bricks10 and, from your /scratch/your_username directory, execute the command:

$ rsync -aH --port=2873 bricks09::scratch/ /scratch/your_username/

For further details see the rsync manual or contact the expert. After the rsync is complete you should terminate the rsync daemon on bricks09.

Returning small files from a SciLens machine to your desktop can be done using scp; for big ones, contact the expert. For importing large amounts of data into the cluster, also contact the expert.

Printing

Files to be printed should be sent to the CWI print spooler and a specific printer, e.g.

lpr -H spool.cwi.nl -Ppear <files>

Accessing the internet

You can use wget to download any page or file available on the WWW from within the cluster.

You have to contact the expert if you want to run a web-client/server setup on the machine, as this will require a custom SSH tunneling setup.

Root-specific features

Some tools that require root privileges have been put in place to make your life easier with respect to performance analysis. These have to be invoked via the sudo command, e.g.:

$ sudo iotop

Pre-installed user requested packages

External software that would normally require root permissions to install (for example, PostgreSQL, MySQL, and friends) can only be installed from source in your local environment.

Non-installed libraries available within the Fedora distributions can be enabled by contacting the expert.

Specific application frameworks, e.g. Java, can be installed from the Fedora repository, but due to versioning issues we advise using a local copy as much as possible. When in doubt, follow the expert route.

Work with multiple machines

With the clush command, one can execute the same command on multiple SciLens machines simultaneously. Assume you are logged in to the scilens2-ssh gateway; then run:

$ clush -w bricks[01-16]

This will give you a prompt. Any command you type in here, e.g., 'df -h /scratch', will be run on the machines bricks01 through bricks16. Try it out to see the results produced by the selected bricks machines.

The -w option allows you to pass a list of machines you want to work on.

In theory, you can address all rocks machines with one clush command: clush -w rocks[001-144]. However, in practice, it is advisable to work with smaller groups of machines, say, 20. This also makes it easier to cancel a command if one of the machines freezes.

With quit you can stop clush. For more information, please see the man page of clush.
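clush's bricks[01-16] node-set notation simply expands to bricks01 through bricks16; if you ever need the same list in plain shell (say, to loop with ssh when clush is not available), a sketch:

```shell
# Expand the node set bricks[01-16] the way clush does, one hostname per line.
for n in $(seq -w 1 16); do
    echo "bricks$n"    # e.g. replace echo with: ssh "bricks$n" df -h /scratch
done
```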

SciLens cluster hardware

  • Standard hardware
    • all tiers (diamonds, stones2, gems, stones, bricks, rocks2, emeralds, pebbles2, jewels, (rocks+, rocks, pebbles))
  • Non-standard hardware
    • diamonds
    • stones2
    • stones
    • bricks
    • rocks2
    • emeralds
    • pebbles2
    • (gems)
    • (rocks+)
    • (rocks)
    • (pebbles)

SciLens cluster use

Visit the SciLens Cluster Use page to see which machines are available for use and to register your own intended use of the cluster.

Please always claim machines using the registration mechanism before you actually start using them! And remember to release machines once you're done using them!

Additional resources for messing around with this wiki

  • Configuration settings list: http://www.mediawiki.org/wiki/Manual:Configuration_settings
  • MediaWiki FAQ: http://www.mediawiki.org/wiki/Manual:FAQ