Friday, July 30, 2010


Scientists break the terabyte barrier

Computer scientists at the University of California, San Diego (USA) broke "the terabyte barrier," and a world record, by sorting more than one terabyte of data (1,000 gigabytes, or 1 million megabytes) in just 60 seconds. In the Sort Benchmark competition, the "World Cup of data sorting," the computer scientists from the Jacobs School of Engineering at UC San Diego also tied the world record for the fastest sorting rate: they sorted one trillion records in 172 minutes, and did it with only a quarter of the computational resources of the previous record holder.

Companies looking for trends, efficiencies and other competitive advantages have turned to this kind of sorting, which demands more processing power than typical data centers can muster. The Internet has also created several scenarios where sorting is critical: advertising on Facebook pages, personal recommendations from Amazon, and second-by-second Google search results are all the product of sorting data sets that run to multiple petabytes. A petabyte is 1,000 terabytes.


"If a big corporation would run a query across all visitors to their sites or products sold, may need to order a set of multi-petabyte data and especially those that grow to several gigabytes per day," computer says Professor UC San Diego Amin Vahdat, project leader. "Companies are taking to limit the amount of information that can be ordered, and how fast. This is analysis of information in real time, "said Vahdat. We need better management technologies, however. " In data centers, the system is the most common bottleneck in many high-level activities, "said Vahdat who directs the Center for Network Systems (CNS) at UC San Diego.
The two new UC San Diego world records are among the results recently disclosed at sortbenchmark.org, a site run by volunteer computer scientists from academia and industry who administer the competitions. The benchmarks provide reference marks for data sorting systems and an interactive forum for researchers working to improve sorting techniques.

World Records

This is the first year the scientists entered the competition, and they won in the Indy Minute Sort and tied the Indy Gray Sort.

In the first, the researchers sorted 1.014 terabytes (1,014 gigabytes) in one minute, breaking the one-minute terabyte barrier for the first time.

"We set our agenda for research on how to improve it ... also make it more generic," says doctoral student in computer science at UC San Diego Alex Rasmussen, leader of the team graduate students.

The team also tied the world record in the Indy Gray Sort, which measures how fast, in terabytes per minute, a system can sort 100 terabytes of data.

"We use computers a quarter of the previous team record was used to achieve the same rate law - which meant using only a quarter of electricity, cooling and physical space," says George Porter, a research scientist in CNS of the UC San Diego.

Both world records fall under the "Indy" category, which means the systems were designed around the specific parameters of the competition. The team aims to generalize their results for the "Daytona" category, for systems that can be used in real-world environments.

"The system is also an interesting way to various information processing problems. In general, it is a good way to measure how fast can you read a lot of data from a disk set, apply some processing, distributed by a network and write another set of disks, "said Rasmussen. "Sort puts too much pressure on the subsystem input / output, from hard disks and network cards to the operating system and applications." Balanced Systems

Balanced Systems

The data sorting challenge the scientists took on is very different from the modest sorts that conventional database systems perform, such as comparing two tables. The biggest difference is that sorting terabytes or petabytes of data exceeds the memory capacity of the server doing the work.

To set the heavy-duty sorting record, the scientists designed a system that is both balanced and fast. A balanced system is one in which resources such as memory, storage and network bandwidth are fully exploited and as few resources as possible are wasted.

"Our system shows what is possible if one pays attention to efficiency - and there is still much to improve," Vahdat says "We ask the question What does build a balanced system in no system resources are being missed having a high performance computing? If you have unused or idle processors all the RAM, you're wasting energy and losing efficiency. "Memory often uses the same energy as a processor or more including, for example, but nobody notices that.

To break the terabyte barrier in the Indy Minute Sort category, the researchers built a system of 52 computer nodes. Each node is a standard server with two quad-core processors, 24 gigabytes of memory and 16 disks, all interconnected by a Cisco Nexus 5020 switch. Cisco donated the switches as part of the ongoing research relationship it has with the Center for Networked Systems at UC San Diego. The computer cluster is hosted at the California Institute for Telecommunications and Information Technology (Calit2).

To win the Indy Gray Sort, the researchers sorted one trillion records in 10,318 seconds (approximately 172 minutes), which left them tied for the world record with a sorting rate of 0.582 terabytes per minute over 100 terabytes of data. The winning system consisted of 47 nodes similar to those used in the minute sort.

100 terabytes of data is the equivalent of about 4,000 single-layer Blu-ray discs, 21,000 single-layer DVDs, 12,000 dual-layer DVDs, or 142,248 CDs (assuming 703 MB CDs).
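As a quick sanity check on those figures, here is the arithmetic in shell, assuming decimal units and capacities of 25 GB per Blu-ray, 4.7 GB per DVD, 8.5 GB per dual-layer DVD and 703 MB per CD:

    echo $(( 100000 / 25 ))        # Blu-ray discs: 4000
    echo $(( 1000000 / 47 ))       # single-layer DVDs: ~21276
    echo $(( 1000000 / 85 ))       # dual-layer DVDs: ~11764
    echo $(( 100000000 / 703 ))    # CDs: ~142247 (the article rounds up)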

Via: Dr. Dobb's




Tuesday, July 13, 2010


And you, what kind of DBA are you?

book "Oracle 11g Database Administrator for Advice Underground's - Beyond the basics - real-world DBA A survival guide for Oracle 11g database Implementations" written by April Sims and published by Packt Publishing who lists the following activities rightly DBA should be met, taking into account of course the work environment as they do not all apply to all cases. For example, a person who was hired to be in charge of say 20 databases with which the company operates, has many more responsibilities that a consultant as-first-a consultant will charge to the extent of consumption of hours care and generally the cost of this service is not cheap. Of course, depending on the contract, the consultant may from diagnostics in this or that problem that the present database, to implement solutions to correct notable performance problems or implementing new functionality all in favor of organizational software process improvement. You could say that both have similar profiles but with different specializations.
What does a DBA do all day? The responsibilities are , install, configure and administer the database, these responsibilities can be divided into scheduled tasks to run at certain intervals. This is a generalized list and depending on the workplace may or may not apply.


Tasks such as monitoring and log rotation can be done via Enterprise Manager, Grid Control, Unix shell scripts, DBMS_SCHEDULER, Perl, third-party database tools, or combinations of these; a minimal cron sketch appears below.

Prioritizing activities: daily, weekly, monthly, quarterly or annual

Let's look at the priority activities that need to be covered. The schedule depends on the needs of the workplace and of the application, weighing the activities as a whole.
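As a concrete example, a daily cron job along those lines might look like the following sketch; the alert-log path and mail address are assumptions (the path varies with version and DIAGNOSTIC_DEST):

    #!/bin/sh
    # Daily alert-log check and rotation (illustrative paths).
    ALERT=/u01/app/oracle/diag/rdbms/orcl/orcl/trace/alert_orcl.log
    DBA=dba-team@example.com       # hypothetical address
    # Mail any ORA- errors found in the current log.
    if grep -q "ORA-" "$ALERT"; then
        grep "ORA-" "$ALERT" | mail -s "ORA- errors in alert log" "$DBA"
    fi
    # Rotate: move the log aside and compress it; Oracle recreates the file.
    mv "$ALERT" "$ALERT.$(date +%Y%m%d)" && gzip "$ALERT.$(date +%Y%m%d)"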



Daily

  • Backups - usually incremental and cumulative, with one full backup a week; the logs are stored and mailed to the DBA in case of failure.
  • Database alert log - ORA- errors, automatic e-mail notifications, pager messages.
  • ADRCI - the Automatic Diagnostic Repository command interpreter and its log rotation utility.
  • Filesystem space, CPU statistics and I/O - requires monitoring support from the OS.
  • SQL statements - the statements ranking in the top 5 or 10.
  • Corruption - RMAN logs, export and/or Data Pump logs, dbverify, v$database_block_corruption (see the sketch after this list).
  • Tablespace growth - growing tablespaces, partition management, temporary tablespaces, undo.
  • Data Guard - check in the logs that apply/transport is in sync.
  • SQL*Net listener logs - intrusion detection.
  • Audit logs and trails - intrusion detection, removing unused accounts.
  • Core and user dumps - the space they occupy, Oracle bugs.
  • Creating new accounts - should be at least partly automated.
  • Notifying users about security changes - at least 24 hours in advance.
  • Migrating schema and code changes, or SQL-specific updates.
  • Growth of large tables, uniform tablespace growth.
  • Keeping track of daily changes to the database - in some shops, a job for dedicated IT staff.
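For the corruption checks in the list above, a hedged sketch of the daily commands; file names and connection details are assumptions:

    # Verify one datafile with dbverify (illustrative path and block size).
    dbv FILE=/u01/oradata/orcl/users01.dbf BLOCKSIZE=8192
    # Scan the whole database; populates v$database_block_corruption.
    rman target / <<'EOF'
    BACKUP VALIDATE CHECK LOGICAL DATABASE;
    EOF
    # Report anything the scan found.
    sqlplus -s / as sysdba <<'EOF'
    SELECT * FROM v$database_block_corruption;
    EOF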


Weekly

  • Backups - usually of the entire database.
  • Database cloning for non-production use - manual or automated scripts.
  • Tablespace growth - the daily figures accumulated over a week.
  • Oracle version upgrades or migration projects for patching - significant updates.
  • Testing the Data Guard site.
  • Reviewing updates on My Oracle Support (MOS) - new patches, updates or new versions.
  • Updating the local intranet with operational procedures.

Monthly


  • Database cloning for non-production use - manual or automated scripts.
  • Tablespace growth monitoring - the weekly figures accumulated over a month (see the sketch after this list).
  • Trends and forecasts - CPU consumption, I/O statistics, access patterns.
  • Changing production passwords - sys, system, wallet, schemas, Grid Control, OAS.
  • Oracle licenses - what is in use and what is covered.
  • Implementing recovery scenarios.
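For the tablespace growth monitoring above, a minimal sketch that snapshots sizes into a monthly log so the accumulated figures can be compared; a local sysdba connection is assumed:

    sqlplus -s / as sysdba <<'EOF' >> /var/log/tbs_growth_$(date +%Y%m).log
    SET PAGESIZE 100 LINESIZE 120
    SELECT tablespace_name, ROUND(SUM(bytes)/1024/1024) AS mb
      FROM dba_data_files
     GROUP BY tablespace_name
     ORDER BY tablespace_name;
    EOF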
Quarterly


  • Applying CPUs (Critical Patch Updates) and PSUs (Patch Set Updates) to production, planned around a service outage; apply CPUs and PSUs to non-production instances first.
  • Tablespace growth monitoring - the monthly figures accumulated over a quarter.
  • Oracle training updates - Oracle University (online or classroom), books, informal meetings.
  • Accumulating trends and forecasts.


Annual

  • Tablespace growth - annual report.
  • Summing up trends and forecasts.
  • Attending Oracle conferences - local or national user groups.
  • Planned Oracle updates with a service outage - version upgrade plus patches plus PSUs, applied in one pass.
  • Software license and service renewals.
  • Hardware evaluation and upgrades.
  • SSL certificate renewal, Oracle Wallets.
So it may look like a bleak set of activities to carry out; however, the DBA has the support of tools such as OEM, Grid Control, third-party monitoring, or custom-made scripts. Hence the repeated point that automating these tasks is of great importance.

If we generalize several of the points mentioned above, they apply not only to the Oracle platform but can also serve as a reference for administrators managing other databases.


Wednesday, July 7, 2010


How to (seriously) mess up as an Oracle DBA

The book "Oracle Database 11g - Underground Advice for Database Administrators: Beyond the basics - A real-world DBA survival guide for Oracle 11g database implementations," published by Packt Publishing and written by April C. Sims, makes an excellent summary of situations or activities that can happen to an Oracle DBA and give us, or our boss, or our boss's boss, a very hard day. Being cautious, focusing on what you are doing, and in some cases double-checking what you are about to run and on which server, is only part of what it takes to live up to the title "DBA." Seriously, some people have the wrong idea of what a DBA is, but that is a topic for another post.

These are the points A. Sims does well to mention.


  • Avoid using rm -rf *.* at any time and for any reason. Try to be specific: rm *.log, *.lis or *.trc. It is better still to use rmdir, and better yet to rename the whole directory and let it sit for a day or two before deleting it.
  • Assuming that all the files you see in a directory belong to a single database is an omen of disaster; these files can be created anywhere in the filesystem that Oracle has write access to.
  • Write access to a production instance through SQL*Plus is not the rule, and is usually not given directly to developers unless someone such as the head of development takes on that responsibility.
  • It is good to use the Unix tool fuser against a file to determine whether it is in use before running rm or mv. Another way is to force a checkpoint and check the file's timestamp before removal: if the file is active, the timestamp is updated. (See the sketch at the end of this list.)
  • Add the ORACLE_SID to the SQL prompt. Visually inspecting the prompt before running a script will prevent a disaster when you only think you are on the right server. Use an extended Unix prompt containing the hostname, username and ORACLE_SID, which adds more visual clues so you know exactly what you are about to modify. (See the sketch at the end of this list.)
  • Copying and pasting directly into SQL*Plus or another command-line tool can execute the wrong code. It is better to copy and paste into a text editor first; that way you check exactly what is in the copy/paste buffer.
  • Type the word production on the command line of a production window when you finish using it. This prevents accidental disasters when switching between windows: if you run something by mistake, it simply fails, because there is no command called production.
  • Run recovery scenarios on a server other than production, and test restores at the operating-system level as well. The disaster recovery site should be on a different server to provide a real failover capability.
  • Make sure you know how to use all of Oracle's command-line tools, and the Unix editor vi, in case you have nothing else at your disposal.
  • Preferably, change the colors of the windows or command-line tools, such as PuTTY, that are connected to production environments versus those connected to non-production environments, and increase the saved history to the largest size possible. Unix also has a session-capture tool called script.
  • Announce that you are about to make a change, just in case. Saying it out loud can give someone time to stop you, or at least to confirm what you are about to do.
  • Log-rotation scripts can wreak havoc if you have named the online redo logs with the .log extension. It is safer to use the .rdo extension.
  • An unknown outside consultant will not necessarily give the best advice. Be cautious until you are sure of their expertise and skill. If possible, have them work under you so you know what they are doing.
  • Avoid the number 8 in any kind of script, in the ORACLE_SID, or in anything you will run against: the wildcard character * sits above the 8 on the keyboard, and an accidental typo is easy and can wreak havoc.
  • Always keep an exhaustive watch on the server's operating-system performance; specifically, do not let it run out of disk space.
  • Be careful with the 'reuse' clause when adding or modifying a database file. The command overwrites the existing datafile, destroying any information it held.
  • Be wary of scripts generated by third-party tools; their statements can be powerful. A script may drop an object before recreating it, which can be disastrous if the information was not backed up.
  • You are responsible for the backups. It is not advisable to delegate them in any way.
  • Be sure to investigate resource usage by users who have special access in production. These users can easily eat up CPU or I/O that OLTP applications need.
  • The Unix root account is not for everyday use, and especially not for routine Oracle tasks. Investigate using sudo to track approved root activities.
  • And the most important tip for avoiding a blunder: when in doubt, it is best not to do anything that you cannot undo, reverse or fix.
The list covers good practices that should not be forgotten in the daily work of a DBA. The consequences of committing a serious error in a production database can range from a reprimand to losing a job, or even legal trouble.
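A minimal sketch of the prompt and fuser habits from the list; the SID and file path are illustrative assumptions:

    # Bash prompt showing user, host and ORACLE_SID before every command.
    export ORACLE_SID=PROD1                     # hypothetical SID
    export PS1='[\u@\h $ORACLE_SID]\$ '
    # In SQL*Plus (10g+), show who and where you are connected:
    #   SQL> set sqlprompt "_user'@'_connect_identifier> "
    # Before an rm or mv, check whether the file is still in use.
    fuser /u01/oradata/orcl/redo01.rdo && echo "file in use - do NOT rm/mv"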

Monday, July 5, 2010

Crystal Clear



One of the most popular icon sets to date is Crystal Clear, created by Brazilian graphic designer Everaldo Coelho. The icon set is licensed under the LGPL (GNU Lesser General Public License).




If you need to give your application a more polished graphic touch, these icons will surely be useful. Everaldo currently runs his own graphic design consultancy, Yellow.com, employing 12 people in the same field.

Who says that work given away through open source projects is not profitable? The complete set, in its many different forms and formats, is here:

http://openiconlibrary.sourceforge.net/downloads.html


You will find previews at Wikimedia: http://commons.wikimedia.org/wiki/Crystal_Clear


Sunday, July 4, 2010


Alternatives to SQL

A new generation of high-performance, low-cost database software is emerging very quickly to challenge the dominance of SQL in distributed processing and in applications with large volumes of data. Some companies have already traded the rich functionality of SQL for these new options, which let them create, work with and manage huge data sets.

NoSQL exists because different implementations of web and cloud-computing business applications have different requirements for their databases. Not every application requires strict data consistency, for example. In addition, when an application distributes its data across hundreds or even thousands of servers, the numbers (it is a money issue) point to license-free server software rather than paying licenses per processor. Once the licensing cost is resolved, such systems can be scaled horizontally on commodity hardware, or on cloud computing services, avoiding a big up-front payout. The traditional tools do not always make this easy. The challenges to SQL's hegemony come from specialized products built from scratch for large-scale analysis and document storage, as well as for building systems that favor high availability over consistency when data is partitioned.
Applications such as online transaction processing, business intelligence, customer relationship management, document processing and social networking do not have the same needs in terms of data, queries or index types, nor do they have equivalent requirements for consistency, scalability and security.

For example, BI (business intelligence) applications running queries for analysis and decision making can take advantage of bitmap indexes for operations over databases gigabytes or terabytes in size. Web analytics, drug discovery, financial modeling and similar applications turn to distributed systems for efficient processing of very large data sets.

OLTP prizes reliability. And social-scale applications like Facebook and Amazon.com have adopted the BASE properties (basically available, soft state, eventually consistent) over the familiar ACID properties (atomicity, consistency, isolation, durability) to support their massive communities of millions of web users. These differences are one reason why non-relational NoSQL databases, column-oriented stores and document databases have gained ground. They are more like specialized tools than Swiss Army knives with full SQL platform functionality. System architects should weigh the characteristics and specialized functions an application needs when choosing a database. NoSQL databases can be built specifically for functions such as BI, OLTP, CRM, social networking and data warehousing, and include features such as scalability, partitioning, security and versatility.

Scalability and High Availability

For cloud computing and for web sites with high data volumes, such as eBay, Amazon, Twitter and Facebook, scalability and high availability are essential. In fact, they are the reason distributed databases have relaxed their consistency requirements.

Systems operating in high-availability environments must survive software, hardware and network failures, and be ready to scale despite unpredictable demand for computing resources. One approach to building such systems is a distributed database with a shared-nothing architecture and horizontal partitioning. Elasticity and sharding (partitioning), both characteristic of NoSQL, are horizontal scaling solutions that provide availability and the capacity to process large volumes of data. A variety of data stores are gaining popularity for building scalable, resilient web applications in environments such as public or private clouds. Distributed key-value stores are great when you do not need to apply SQL rules, strict consistency, complex queries, integrated queuing, or database operations that exceed the available RAM. The new data stores offer low latency and scalability to applications that do not require elaborate query and analytical capabilities. Amazon developed SimpleDB and Google developed Bigtable. Other low-latency open source options include Cassandra, Hypertable, MongoDB, Project Voldemort, Redis and Tokyo Tyrant, as well as Dynamo, the database used at Amazon, whose S3 service contained 102 billion objects as of March.
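To give a feel for the key-value model these stores share, here is a minimal sketch using the Redis command-line client; the key names are made up, and the other stores listed expose analogous get/put interfaces:

    redis-cli SET user:42:name "Ada"       # put a value under a key
    redis-cli GET user:42:name             # get it back -> "Ada"
    redis-cli EXPIRE user:42:name 3600     # optional: expire after one hour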


Options

Google developed Bigtable to distribute data across thousands of servers and scale to data sets on the order of petabytes. Applications such as web indexing, Google Earth, Google Maps, Blogger, YouTube and Gmail use it. YouTube's collection of 100 million videos alone requires 600 TB of space. Bigtable is proprietary, but its data model lives on in the open source implementations Hypertable, Cassandra and HBase. Bigtable can be used as the input or output of MapReduce, which allows distributed processing of files or databases using map and reduce functions.

Dynamo was created to provide a highly available key-value data store that does not lose data to server failures or network problems. Amazon SimpleDB was later built as a key-value store available to Amazon Web Services customers. SimpleDB is limited to no more than 256 name-value attribute pairs per item, domains no larger than 10 GB, and databases of no more than 1 TB. Amazon says copies of the data are updated within a second to maintain consistency. SimpleDB uses a query language similar to SQL.

Project Voldemort, an open source clone of Amazon's Dynamo, is a key-value data store that supports versioning, eventual consistency (where the database sometimes returns a stale answer in order to stay available), and automatic partitioning and replication. Keys and values can be complex objects such as maps or lists. Project Voldemort supports building distributed data stores that keep working when individual servers shut down. LinkedIn developers created it, and sites like Lookery use it.

Cassandra integrates Bigtable's data model with Dynamo's distributed design. It provides eventual consistency, not the strict consistency that e-commerce and stock market transactions require. Instead of storing data in plain row-major or column-major order, Cassandra uses a ColumnFamily layout inspired by Bigtable.

Cassandra can be geographically distributed across multiple data centers, as with Amazon EC2's availability zones. Bulk processing can be done with Hadoop.


The Cost of Scaling

SimpleGeo, a provider of geographic data, uses Apache Cassandra, the open source NoSQL store, to avoid the licensing costs of commercial database systems as part of its effort to scale out a multi-data-center architecture.

"We run a cluster of 50 nodes spanning three data centers on Amazon's EC2 service, paying about $10,000 a month," says the company's chief technology officer, Joe Stump, who previously used Cassandra at Digg. "In contrast, premium MySQL support would cost about USD $5,000 per node per year, or $250,000 every year, more than double the cost of the Cassandra deployment," adds Stump, "and Microsoft SQL Server can cost up to $55,000 per processor per year."

"USD $ 10.000 an operating expense is the opposite of spending older, and that is 'a nice little tax,' "he says.

Cassandra provides availability and scalability to a number of well-known sites, including the huge communities of Twitter and Facebook. When Twitter's user numbers took off, it migrated from a combined MySQL/memcached setup to a cluster of 45 Cassandra nodes. That environment is now responsible for 50 million tweets per day. Facebook adds about 60 million photos a week using Cassandra. At Digg, Cassandra manages about 3 TB of information.

Digg announced its move from MySQL to Cassandra with great fanfare. The main reason for moving the Digg platform was that "it became problematic as write loads on the database grew intense against a rapidly expanding data set, with no sign of slowing," said John Quinn, Digg's vice president of engineering. Digg's growth forced horizontal and vertical partitioning strategies that eliminated most of the benefits of a relational database, and yet the overhead was still there, says Quinn.

"Our system is growing rapidly and needs to be provided with performance and redundancy with multiple data centers and to add capacity or replace faulty nodes immediately. As for the consistency of information Digg engineers can implement application-level controls more efficiently with MySQL Cassandra, "said Quinn.

Tokyo Tyrant is an open source database server, accompanied by a full-text search engine, that is gaining attention in the NoSQL community. It is a key-value database with hash and B-tree index structures, able to insert 1 million records in 0.4 seconds and serve 58,000 queries per second. It supports asynchronous replication, transaction processing with ACID properties, and write-ahead transaction logging. It can be used from various programming languages, including Perl, Java, Ruby and PHP. Production deployments include Scribd and Mixi, the Japanese equivalent of Facebook. LightCloud turns Tokyo Tyrant into a distributed database by adding a horizontally scalable hashing layer. The social journal Plurk uses this Tokyo Tyrant option, LightCloud.

Document Stores

MongoDB and CouchDB are examples of JSON-style document databases, while a large number of products store documents encoded in XML. MongoDB is a popular product with a client-server architecture, B-tree indexes and communication over TCP/IP.

MongoDB manages collections of JSON objects and provides scalability through sharding and replication. Queries are themselves JSON objects, and 2-D geospatial search is also provided. There are APIs for several languages, including drivers for JavaScript, Java, Perl, PHP, Python, Ruby and C++. Products whose implementations use MongoDB include Justin.tv, The New York Times, Disqus, Electronic Arts and Business Insider.

CouchDB is a schemaless data store that provides a REST-style API for CRUD operations (create, retrieve, update, delete) on documents. CouchDB can perform retrievals by key value and can run MapReduce for non-trivial queries. It can also generate views using JavaScript; creating a view can take time, but subsequent queries that use it are very fast. CouchDB supports multi-master replication and distributing data across multiple instances. It manages documents in JSON format, uses the SpiderMonkey JavaScript engine, and is well suited to web applications, with access from Erlang, HTTP, JavaScript, PHP, Python and Ruby.
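A minimal sketch of that REST-style CRUD API, assuming a default CouchDB install on localhost port 5984 and a hypothetical "notes" database:

    curl -X PUT http://127.0.0.1:5984/notes                  # create database
    curl -X PUT http://127.0.0.1:5984/notes/n1 \
         -d '{"title":"hello","body":"first note"}'          # create document
    curl http://127.0.0.1:5984/notes/n1                      # retrieve it
    # Updates and deletes must quote the current revision (_rev)
    # returned by the reads above, e.g.:
    #   curl -X DELETE 'http://127.0.0.1:5984/notes/n1?rev=<rev>'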

For those who prefer XML documents and XQuery to JSON documents and JavaScript queries, there are a number of open source and commercial products. In addition to XML document repositories, there are dozens of XQuery processors. The list includes Apache Xindice, Berkeley DB XML, eXist-db, IBM DB2, MonetDB, Mark Logic, Sedna, Tamino and WebMethods' TigerLogic XDMS.

Distributed Processing



When it comes to distributed processing of massive data sets, Hadoop's MapReduce has become the technology of choice. Researchers at Yahoo, for example, used it on 3,800 nodes to sort a petabyte of information in 16.25 hours.

Google developed MapReduce and recently patented it. The map function produces a list of key-value pairs, from which the reduce function produces a list of values.
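The classic illustration is a word count. Purely as an analogy (Unix pipes, not Hadoop), the map step emits one key per word, sort plays the role of the shuffle that groups equal keys, and the reduce step aggregates each group:

    tr -s ' ' '\n' < input.txt |   # map: emit one word (key) per line
        sort |                     # shuffle: bring equal keys together
        uniq -c |                  # reduce: count each key's occurrences
        sort -rn | head            # show the most frequent words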

The Apache Hadoop project includes the Hadoop Distributed File System (HDFS), MapReduce, the HBase database, the Pig analysis language and the Hive query and analysis tool, among others. HBase is a distributed column-oriented store, modeled after Google's Bigtable, that can serve as input or output for MapReduce.

HBase is one of many column stores competing in the analytics and business intelligence market. Storing tables in column-major order provides substantial performance gains over tables stored in row-major order: benefits such as better locality and cache behavior improve read performance, but write performance is poorer. Other column stores include Sybase IQ, Vertica and C-Store, an open source collaboration among several universities.

Growing interest in semantic search and linked data has put the spotlight on RDF (Resource Description Framework) triple stores. The options include AllegroGraph, Bigdata, Garlik, Jena, Ontotext's BigOWLIM, OpenLink Virtuoso, Oracle 11g and Sesame. Several of these have been deployed on Amazon EC2 to exploit the distributed processing power of the cloud. Raytheon BBN researchers also used Hadoop MapReduce to create a distributed RDF store that supports SPARQL query processing.

Restrictions and Best Practices

To ensure durability and data integrity, SQL databases provide transaction logging and replication. The NoSQL options need something similar. Cassandra, for example, supports both. Tokyo Cabinet and HBase support write-ahead logging. Tokyo Cabinet and CouchDB support master-master replication, while MongoDB supports master-slave replication and replica pairs.

Architects employing document-oriented databases must decide how to store each document type and whether to keep a separate database for each type. Instead of separate databases, they can include an attribute that records the document's type. The new generation of data stores is intended to serve availability and scalability needs, although they accept certain restrictions to achieve greater efficiency. With Amazon SimpleDB, for example, the maximum time a query may run is 5 seconds; if the query takes longer, SimpleDB returns a partial result and the application must issue further queries to complete it. SimpleDB restricts the result of a query to a maximum of 250 items, while Google recently raised the maximum result of an AppEngine datastore query to 1,000 items.

In horizontally partitioned systems, queries that require joins across shards are expensive, so designing the partitioning scheme takes skill and knowledge of the data's usage patterns. When you need complex queries with aggregation, NoSQL databases are not a good option, but they can be a data source for separate solutions in charge of the analysis. Organizations that use key-value stores sometimes need indexing and query capabilities, and may pair them with other software that supports indexing and querying, such as Apache Lucene. Whether your organization uses SQL or NoSQL databases, it is a good idea to use version control and to keep separate databases for testing and production.

For all the areas the NoSQL options address, we are still left with the question of which database software to adopt. The answer depends on basic things: How much data, and of what kind, will be stored? Will it be used for complex queries? How many concurrent users will there be? Will the database scale as users and data grow? SQL or NoSQL: that is the first thing to define.

Via: Dr. Dobb's