The PennHPC facility opened in April 2013 to meet growing demand for genomics processing and storage, as well as demand from other scientific areas that require computational capacity, such as imaging and biostatistics/bioinformatics. The cluster is managed by two full-time system administrators and is located at the Philadelphia Technology Park, a Tier-3, SSAE 16/SAS 70 Type II audit-compliant colocation/datacenter facility.
"The 144 IBM iDataPlex cluster nodes each house two eight-core Intel E5-2665 2.4Ghz Xeon Processors, 192 or 256 Gigabytes of RAM, and a 500 Gigabyte local hard drive. These nodes are sub-divided into virtual processing cores, with the capability to provision up to 5,100 virtual cores at 6GB of RAM per virtual core. Cluster and storage interconnection is provided by a 40Gbps Ethernet fabric with 10Gbps to each node. The cluster nodes are attached to 1.8 Petabytes of IBM Storwise V7000 disk storage, housed in two separate performance tiers (no backup). The disk is presented to the compute nodes via an ten-node IBM Scale out Network Attached Storage (SONAS) system leveraging the IBM General Parallel File System (GPFS). Computational job scheduling/queuing and cluster management is orchestrated by the IBM Platform Computing (LSF) suite of products."
"The cluster nodes are attached to 1.8 Petabytes of IBM Storwise V7000 disk storage, housed in two separate performance tiers (no backup). The disk is presented to the compute nodes via a ten-node IBM Scale out Network Attached Storage (SONAS) system leveraging the IBM General Parallel File System (GPFS)."
A SpectraLogic T950 tape library houses 1,910 LTO-5 tapes, 290 LTO-6 tapes, and 12 LTO-6 drives, for a total raw capacity of 3.6 Petabytes.
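The raw capacity figure can be reproduced from the tape counts. The sketch below assumes the standard native (uncompressed) cartridge capacities of 1.5 TB for LTO-5 and 2.5 TB for LTO-6, and decimal units (1 PB = 1,000,000 GB). It also assumes that "mirrored" means two full copies of every tape, which would account for the 1.8 PB usable figure quoted for archive and backup storage; neither assumption is stated in the description itself.

```python
# Tape counts from the library description.
LTO5_TAPES = 1910
LTO6_TAPES = 290

# Assumed native (uncompressed) capacities per cartridge, in GB.
LTO5_NATIVE_GB = 1500
LTO6_NATIVE_GB = 2500

# Total raw capacity, kept in integer GB to avoid rounding noise.
raw_gb = LTO5_TAPES * LTO5_NATIVE_GB + LTO6_TAPES * LTO6_NATIVE_GB
raw_pb = raw_gb / 1_000_000

# Assumption: mirroring keeps two copies, so usable space is half of raw.
mirrored_pb = raw_pb / 2

print(f"raw capacity:      {raw_pb:.2f} PB")
print(f"mirrored capacity: {mirrored_pb:.2f} PB")
```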
"The PMACS HPC system houses 1.8PB of mirrored tape storage for both archive and backup uses. Archive storage provides clients with a primary data storage target that will look and function as any other hard drive, though often at a slower rate of data transfer. The archive is 100% client accessible, and clients need only to mount the drive on their computer and begin to copy/move data to/from the drive. PMACS HPC staff will provide the necessary information to mount the drive, and assist clients with initial setup. Data can be moved from an existing drive share on the PMACS HPC disk system, a scientific instrument, a desktop, or any other device that can map an NFS or CIFS share.
Archive storage is NOT meant for data that will be accessed regularly, nor for data on which computation must be performed. The archive is “active” in that end-user files are available in real time; however, its purpose is the long-term storage of infrequently accessed files.
Archive performance is variable and depends on how heavily the system is being used at any given time. No service levels are guaranteed for data ingress or egress performance; however, average data retrieval times will be available once the system has been operational long enough for performance metrics to be calculated."
"There is no charge assessed for data transfer to/from the PMACS HPC system."
"The PMACS HPC houses 1.8 Petabtyes (1,800 Terabytes) of disk storage. Some of this storage is used for system utility functions; however most of the storage (about 80%) is available for the storage and processing of HPC jobs.
Data is accessible via both NFS (Linux/Unix/Mac) and CIFS (Windows) drive shares. Shares can be mapped to client desktops and laptops, as well as to biomedical equipment, such as high-throughput sequencers, that is capable of mapping NFS and/or CIFS shares.
Once data resides on the PMACS HPC disk storage, it is accessible to the compute cores and queue management software and can become part of computational jobs.
Storage rates apply to any data stored, whether short-term or long-term. Rates are calculated regularly throughout the month.
Disk storage costs do not include backup services. The data protection included in disk storage costs is based on Redundant Array of Independent Disks (RAID) technology, which protects data should one or more drives in the system fail. Backup services are available on an as-requested basis and incur the “Archive/Backup Storage” rate detailed below."
"The computational hour can also be referred to as a “core-hour” or “virtual-core-hour.” The system is configured in such a manner as to allow more computational-hours than the number of physical processors in the system. This virtualization provides a greater computational capacity than if each physical processing core was utilized individually, therefore minimizing queue wait times for jobs to complete. Each virtual core is assigned up to 8GB of RAM, however alternate RAM configurations are available as-needed."
"Cluster Operating System is Red Hat Enterprise Linux v. 6.4, and this version is installed on all compute nodes. Alternate OS installations are available on an as-needed basis, by request only. Cluster software is either loaded on the individual node hard drives by HPC staff, providing the best performance, or is run from a client’s share on the SONAS file system, generally not requiring any HPC staff involvement. Clients needing software installation should place a service request in the PMACS HPC queue, or e-mail to email@example.com."
"The PMACS HPC uses the IBM Platform LSF job scheduling system."