public void sumoverthreadsandmpi()
{
    // Sum the per-thread point counts.
    for (int ThreadNo = 0; ThreadNo < NumberofThreads; ThreadNo++)
    {
        TotalNumberofPoints += NumberofPoints[ThreadNo];
    }

    // Sum the per-thread vectors; each parallel task owns a disjoint index range.
    Parallel.For(0, Program.ParallelOptions.MaxDegreeOfParallelism, Program.ParallelOptions, (ArrayThreadNo) =>
    {
        int beginindex = ParallelArrayRanges[ArrayThreadNo].StartIndex;
        int indexlength = ParallelArrayRanges[ArrayThreadNo].Length;
        for (int ArrayLoop = beginindex; ArrayLoop < beginindex + indexlength; ArrayLoop++)
        {
            TotalVectorSum[ArrayLoop] = 0.0;
            for (int ThreadNo = 0; ThreadNo < NumberofThreads; ThreadNo++)
            {
                TotalVectorSum[ArrayLoop] += VectorSum[ThreadNo][ArrayLoop];
            }
        }
    });

    // Combine the node-local totals across all MPI ranks.
    if (DAVectorUtility.MPI_Size > 1)
    {
        DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming1);
        TotalNumberofPoints = DAVectorUtility.MPI_communicator.Allreduce<double>(TotalNumberofPoints, Operation<double>.Add);
        TotalVectorSum = DAVectorUtility.MPI_communicator.Allreduce<double>(TotalVectorSum, Operation<double>.Add);
        DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming1);
    }
}
public void sumoverthreadsandmpi()
{
    for (int ThreadNo = 0; ThreadNo < NumberofThreads; ThreadNo++)
    {
        TotalNumberofPoints += NumberofPoints[ThreadNo];
        Totalmean += mean[ThreadNo];
        Totalsquare += square[ThreadNo];
    }
    if (DAVectorUtility.MPI_Size > 1)
    {
        DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming1);
        TotalNumberofPoints = DAVectorUtility.MPI_communicator.Allreduce<double>(TotalNumberofPoints, Operation<double>.Add);
        Totalmean = DAVectorUtility.MPI_communicator.Allreduce<double>(Totalmean, Operation<double>.Add);
        Totalsquare = DAVectorUtility.MPI_communicator.Allreduce<double>(Totalsquare, Operation<double>.Add);
        DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming1);
    }
    if (TotalNumberofPoints < 0.5) { return; }
    Totalmean = Totalmean / TotalNumberofPoints;
    Totalsquare = (Totalsquare / TotalNumberofPoints) - Totalmean * Totalmean;
    Totalsigma = Math.Sqrt(Math.Max(0.0, Totalsquare));
}
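// Illustrative sketch, not part of the original source: the mean/sigma reduction
// works because counts, sums of x and sums of x*x are all additive, so partial
// results from threads (or MPI ranks) can simply be added before the final division.
// MomentSketch and Combine are hypothetical names introduced only for this example.
// Unlike the method above, which returns early and leaves its fields untouched,
// this sketch returns zeros when there are no points (an assumption for brevity).
public static class MomentSketch
{
    // Returns { mean, sigma } computed from per-thread partial sums.
    public static double[] Combine(double[] counts, double[] sums, double[] squares)
    {
        double n = 0.0;
        double s = 0.0;
        double q = 0.0;
        for (int t = 0; t < counts.Length; t++)
        {
            n += counts[t];
            s += sums[t];
            q += squares[t];
        }
        if (n < 0.5) { return new double[] { 0.0, 0.0 }; }
        double mean = s / n;
        // Variance as E[x*x] - E[x]^2, clamped at zero to absorb rounding error.
        double variance = (q / n) - mean * mean;
        return new double[] { mean, System.Math.Sqrt(System.Math.Max(0.0, variance)) };
    }
}
// Usage: a thread holding {1, 2} and a thread holding {3, 4} contribute
// counts {2, 2}, sums {3, 7} and squares {5, 25}.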
public void sumoverthreadsandmpi()
{
    for (int ThreadNo = 0; ThreadNo < NumberofThreads; ThreadNo++)
    {
        TotalNumberofPoints += NumberofPoints[ThreadNo];
        for (int ArrayLoop = 0; ArrayLoop < ArraySize; ArrayLoop++)
        {
            Totalmean[ArrayLoop] += mean[ThreadNo][ArrayLoop];
        }
    }
    if (DAVectorUtility.MPI_Size > 1)
    {
        DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming1);
        TotalNumberofPoints = DAVectorUtility.MPI_communicator.Allreduce<double>(TotalNumberofPoints, Operation<double>.Add);
        Totalmean = DAVectorUtility.MPI_communicator.Allreduce<double>(Totalmean, Operation<double>.Add);
        DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming1);
    }
    if (TotalNumberofPoints < 0.5) { return; }
    for (int ArrayLoop = 0; ArrayLoop < ArraySize; ArrayLoop++)
    {
        Totalmean[ArrayLoop] = Totalmean[ArrayLoop] / TotalNumberofPoints;
    }
}
} // End SALSAFullPrint

public static void SALSASyncPrint(int PrintOption, string GlobalStufftoPrint, string StufftoPrint)
{
    if (DebugPrintOption < PrintOption) { return; }
    string TotalStufftoPrint = "";
    if (StufftoPrint.Length > 0)
    {
        TotalStufftoPrint = " Node:" + MPI_Rank.ToString() + " " + StufftoPrint;
    }
    DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming3);
    TotalStufftoPrint = DAVectorUtility.MPI_communicator.Allreduce<string>(TotalStufftoPrint, Operation<string>.Add);
    DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming3);
    TotalStufftoPrint = GlobalStufftoPrint + TotalStufftoPrint;
    if (MPI_Rank != 0) { return; }
    CosmicOutput.Add(TotalStufftoPrint);
    if (ConsoleDebugOutput)
    {
        Console.WriteLine(TotalStufftoPrint);
    }
    return;
} // End SALSASyncPrint
} // End synchronizeboolean(double cosmicdouble)

public static void SynchronizeMPIvariable(ref int cosmicint)
{
    if (DAVectorUtility.MPI_Size > 1)
    {
        DAVectorUtility.StartSubTimer(DAVectorUtility.MPISynchTiming);
        // Rank 0 is the root: every other rank receives its value.
        DAVectorUtility.MPI_communicator.Broadcast<int>(ref cosmicint, 0);
        DAVectorUtility.StopSubTimer(DAVectorUtility.MPISynchTiming);
    }
    return;
} // End SynchronizeMPIvariable(ref int cosmicint)
public void sumoverthreadsandmpi()
{
    // Sum over threads; each parallel task owns a disjoint index range.
    DAVectorUtility.StartSubTimer(DAVectorUtility.ThreadTiming);
    Parallel.For(0, Program.ParallelOptions.MaxDegreeOfParallelism, Program.ParallelOptions, (ArrayThreadNo) =>
    {
        int beginindex = ParallelArrayRanges[ArrayThreadNo].StartIndex;
        int indexlength = ParallelArrayRanges[ArrayThreadNo].Length;
        for (int ArrayLoop = beginindex; ArrayLoop < beginindex + indexlength; ArrayLoop++)
        {
            double tmp = 0.0;
            for (int ThreadNo = 0; ThreadNo < NumberofThreads; ThreadNo++)
            {
                tmp += VectorSum[ThreadNo][ArrayLoop];
            }
            TotalVectorSum[ArrayLoop] = tmp;
        }
    });
    DAVectorUtility.StopSubTimer(DAVectorUtility.ThreadTiming);

    if (DAVectorUtility.MPI_Size > 1)
    {
        DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming1);
        TotalNumberofPoints = DAVectorUtility.MPI_communicator.Allreduce<double>(TotalNumberofPoints, Operation<double>.Add);
        int bigsize = TotalVectorSum.Length;
        if (bigsize <= 4096)
        {
            TotalVectorSum = DAVectorUtility.MPI_communicator.Allreduce<double>(TotalVectorSum, Operation<double>.Add);
        }
        else
        {
            // Reduce long vectors in chunks of at most 4096 doubles.
            // The final chunk still sends the full 4096-element buffer; only the
            // first whatsleft entries are copied back, so the stale tail is ignored.
            double[] buffer = new double[4096];
            int start = 0;
            while (start < bigsize)
            {
                int whatsleft = Math.Min(bigsize - start, 4096);
                for (int innerloop = 0; innerloop < whatsleft; innerloop++)
                {
                    buffer[innerloop] = TotalVectorSum[start + innerloop];
                }
                buffer = DAVectorUtility.MPI_communicator.Allreduce<double>(buffer, Operation<double>.Add);
                for (int innerloop = 0; innerloop < whatsleft; innerloop++)
                {
                    TotalVectorSum[start + innerloop] = buffer[innerloop];
                }
                start += whatsleft;
            }
        }
        DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming1);
    }
}
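// Illustrative sketch, not part of the original source: the chunked reduction
// above can be expressed independently of MPI by passing the collective in as a
// delegate. ChunkedReduceSketch, ReduceInChunks and the reduce parameter are
// hypothetical names; in the real code the delegate would wrap an Allreduce call.
// This variant also zeroes the unused tail of the last chunk, which the original
// does not do (an assumption made here to keep the reduced buffer deterministic).
public static class ChunkedReduceSketch
{
    public static void ReduceInChunks(double[] data, int chunkSize, System.Func<double[], double[]> reduce)
    {
        double[] buffer = new double[chunkSize];
        int start = 0;
        while (start < data.Length)
        {
            int count = System.Math.Min(data.Length - start, chunkSize);
            System.Array.Copy(data, start, buffer, 0, count);
            // Clear the tail so values from the previous chunk are not reduced again.
            System.Array.Clear(buffer, count, chunkSize - count);
            double[] reduced = reduce(buffer);
            System.Array.Copy(reduced, 0, data, start, count);
            start += count;
        }
    }
}
// Usage: a stand-in "reduce" that doubles every element mimics two ranks holding
// identical data, so { 1, 2, 3, 4, 5 } with chunkSize 2 becomes { 2, 4, 6, 8, 10 }.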
public void sumoverthreadsandmpi()
{
    for (int ThreadNo = 0; ThreadNo < NumberofThreads; ThreadNo++)
    {
        TotalNumberofPoints += NumberofPoints[ThreadNo];
        Total += TotalinThread[ThreadNo];
    }
    if (DAVectorUtility.MPI_Size > 1)
    {
        DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming1);
        TotalNumberofPoints = DAVectorUtility.MPI_communicator.Allreduce<double>(TotalNumberofPoints, Operation<double>.Add);
        Total = DAVectorUtility.MPI_communicator.Allreduce<double>(Total, Operation<double>.Add);
        DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming1);
    }
}
public void sumoverthreadsandmpi()
{
    for (int ThreadNo = 0; ThreadNo < NumberofThreads; ThreadNo++)
    {
        TotalNumberofPoints += NumberofPoints[ThreadNo];
        TotalOr = Orvalue[ThreadNo] || TotalOr;
    }
    if (DAVectorUtility.MPI_Size > 1)
    {
        DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming1);
        TotalNumberofPoints = DAVectorUtility.MPI_communicator.Allreduce<double>(TotalNumberofPoints, Operation<double>.Add);
        TotalOr = DAVectorUtility.MPI_communicator.Allreduce<bool>(TotalOr, Operation<bool>.LogicalOr);
        DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming1);
    }
    return;
}
public void sumoverthreadsandmpi()
{
    for (int ThreadNo = 0; ThreadNo < NumberofThreads; ThreadNo++)
    {
        TotalNumberofPoints += NumberofPoints[ThreadNo];
        TotalMax = Math.Max(TotalMax, Maxvalue[ThreadNo]);
    }
    if (DAVectorUtility.MPI_Size > 1)
    {
        DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming1);
        TotalNumberofPoints = DAVectorUtility.MPI_communicator.Allreduce<double>(TotalNumberofPoints, Operation<double>.Add);
        TotalMax = DAVectorUtility.MPI_communicator.Allreduce<double>(TotalMax, Operation<double>.Max);
        DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming1);
    }
    return;
}
public void sumoverthreadsandmpi()
{
    for (int ThreadNo = 0; ThreadNo < NumberofThreads; ThreadNo++)
    {
        TotalNumberofPoints += NumberofPoints[ThreadNo];
        TotalInt += Intvalue[ThreadNo];
    }
    if (DAVectorUtility.MPI_Size > 1)
    {
        DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming1);
        TotalNumberofPoints = DAVectorUtility.MPI_communicator.Allreduce<int>(TotalNumberofPoints, Operation<int>.Add);
        TotalInt = DAVectorUtility.MPI_communicator.Allreduce<int>(TotalInt, Operation<int>.Add);
        DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming1);
    }
    return;
}
public void sumoverthreadsandmpi()
{
    for (int ThreadNo = 0; ThreadNo < NumberofThreads; ThreadNo++)
    {
        TotalNumberofPoints += NumberofPoints[ThreadNo];
        for (int ArrayLoop = 0; ArrayLoop < ArraySize; ArrayLoop++)
        {
            TotalVectorSum[ArrayLoop] += VectorSum[ThreadNo][ArrayLoop];
        }
    }
    if (DAVectorUtility.MPI_Size > 1)
    {
        DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming1);
        TotalNumberofPoints = DAVectorUtility.MPI_communicator.Allreduce<int>(TotalNumberofPoints, Operation<int>.Add);
        TotalVectorSum = DAVectorUtility.MPI_communicator.Allreduce<int>(TotalVectorSum, Operation<int>.Add);
        DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming1);
    }
}
public void sumoverthreadsandmpi()
{
    for (int ThreadNo = 0; ThreadNo < NumberofThreads; ThreadNo++)
    {
        // A negative index means this thread recorded no candidate.
        if (IndexValue[ThreadNo] < 0) { continue; }
        TotalNumberofPoints += NumberofPoints[ThreadNo];
        if (MinMaxPointer != 0)
        {
            // Maximum search: keep the current total if it is already larger.
            if ((TotalIndexValue >= 0) && (TotalMaxOrMin > MaxOrMinvalue[ThreadNo])) { continue; }
        }
        else
        {
            // Minimum search: keep the current total if it is already smaller or equal.
            if ((TotalIndexValue >= 0) && (TotalMaxOrMin <= MaxOrMinvalue[ThreadNo])) { continue; }
        }
        TotalMaxOrMin = MaxOrMinvalue[ThreadNo];
        TotalIndexValue = IndexValue[ThreadNo];
    }
    if (DAVectorUtility.MPI_Size > 1)
    {
        DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming1);
        if (MinMaxPointer != 0)
        {
            DAVectorUtility.AllReduceMaxWithIndex(ref TotalMaxOrMin, ref TotalIndexValue);
        }
        else
        {
            DAVectorUtility.AllReduceMinWithIndex(ref TotalMaxOrMin, ref TotalIndexValue);
        }
        TotalNumberofPoints = DAVectorUtility.MPI_communicator.Allreduce<double>(TotalNumberofPoints, Operation<double>.Add);
        DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming1);
    }
    return;
}
public void sumoverthreadsandmpi()
{
    for (int ThreadNo = 0; ThreadNo < NumberofThreads; ThreadNo++)
    {
        TotalNumberofPoints += NumberofPoints[ThreadNo];
        for (int loop = 0; loop < NumberinSum; loop++)
        {
            TotalSum[loop] += Sum[ThreadNo][loop];
        }
    }
    if (DAVectorUtility.MPI_Size > 1)
    {
        DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming1);
        TotalNumberofPoints = DAVectorUtility.MPI_communicator.Allreduce<double>(TotalNumberofPoints, Operation<double>.Add);
        TotalSum = DAVectorUtility.MPI_communicator.Allreduce<double>(TotalSum, Operation<double>.Add);
        DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming1);
    }
    return;
}
public void sumoverthreadsandmpi()
{
    for (int ThreadNo = 0; ThreadNo < NumberofThreads; ThreadNo++)
    {
        TotalNumberofPoints += NumberofPoints[ThreadNo];
        for (int ArrayLoop = 0; ArrayLoop < ArraySize; ArrayLoop++)
        {
            TotalVectorMax[ArrayLoop] = Math.Max(TotalVectorMax[ArrayLoop], VectorMax[ThreadNo][ArrayLoop]);
        }
    }
    if (DAVectorUtility.MPI_Size > 1)
    {
        DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming1);
        TotalNumberofPoints = DAVectorUtility.MPI_communicator.Allreduce<double>(TotalNumberofPoints, Operation<double>.Add);
        TotalVectorMax = DAVectorUtility.MPI_communicator.Allreduce<double>(TotalVectorMax, Operation<double>.Max);
        DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming1);
    }
}
// Note FinalClusterCount, FinalCreatedIndex and FinalHostSpecification ONLY used in UpdateMode 2 and only set in this case
public void PipelineDistributedBroadcast(double[][] InitialDoubleComponents, double[][] FinalDoubleComponents, int[][] InitialIntegerComponents, int[][] FinalIntegerComponents, int[] InitialCreatedIndex, int[] FinalCreatedIndex, int[] InitialHostSpecification, int[] FinalHostSpecification, ref int FinalClusterCount)
{
    FinalClusterCount = 0;
    if (DAVectorUtility.MPI_Size <= 1) { return; }

    // Now process distributed clusters
    // Variables for processing createdindex
    int NodeStoragePosition = -1;
    int TransportedStoragePosition = -1;
    int NodeAccumulationPosition = -1;
    int ThreadAccumulationPosition = -1;

    // Place where received data stored
    int FinalDataLocationIndex = -1;

    ++Program.NumberPipelineGroups; // Increment calls of this routine

    int[] DownbySteps = new int[DAVectorUtility.MPI_Size];
    int[] UpbySteps = new int[DAVectorUtility.MPI_Size];
    int[] DownbyStepsTotal = new int[DAVectorUtility.MPI_Size];
    int[] UpbyStepsTotal = new int[2 * DAVectorUtility.MPI_Size];
    for (int PipelineSteps = 0; PipelineSteps < DAVectorUtility.MPI_Size; PipelineSteps++)
    {
        DownbySteps[PipelineSteps] = 0;
        UpbySteps[PipelineSteps] = 0;
    }

    // Set NumberUp and NumberDown
    for (int ClusterIndirectIndex = 0; ClusterIndirectIndex < InitialArraySize; ClusterIndirectIndex++)
    {
        int PackedHost = InitialHostSpecification[ClusterIndirectIndex];
        for (int PipelineSteps = 1; PipelineSteps < DAVectorUtility.MPI_Size; PipelineSteps++)
        {
            if (HostRangeProcessing)
            {
                int H1 = PackedHost >> ClusteringSolution.PACKINGSHIFT;
                int H2 = H1 >> ClusteringSolution.PACKINGSHIFT;
                H1 = H1 & ClusteringSolution.PACKINGMASK;
                if (H2 > (DAVectorUtility.MPI_Rank + PipelineSteps - 1)) { ++UpbySteps[PipelineSteps]; }
                if (H1 < (DAVectorUtility.MPI_Rank - PipelineSteps + 1)) { ++DownbySteps[PipelineSteps]; }
            }
            else
            {
                int H = PackedHost & ClusteringSolution.PACKINGMASK;
                if (H > (DAVectorUtility.MPI_Rank + PipelineSteps - 1)) { ++UpbySteps[PipelineSteps]; }
                if (H < (DAVectorUtility.MPI_Rank - PipelineSteps + 1)) { ++DownbySteps[PipelineSteps]; }
            }
        }
    }
    for (int PipelineSteps = 0; PipelineSteps < DAVectorUtility.MPI_Size; PipelineSteps++)
    {
        UpbyStepsTotal[PipelineSteps] = UpbySteps[PipelineSteps];
        UpbyStepsTotal[PipelineSteps + DAVectorUtility.MPI_Size] = DownbySteps[PipelineSteps];
    }
    DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming4);
    UpbyStepsTotal = DAVectorUtility.MPI_communicator.Allreduce<int>(UpbyStepsTotal, Operation<int>.Add);
    DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming4);
    for (int PipelineSteps = 0; PipelineSteps < DAVectorUtility.MPI_Size; PipelineSteps++)
    {
        DownbyStepsTotal[PipelineSteps] = UpbyStepsTotal[PipelineSteps + DAVectorUtility.MPI_Size];
    }

    // Variables Used for Up and Down Sections
    bool Initialstep;
    int CurrentNode = DAVectorUtility.MPI_Rank;
    int ReceivedTotal = 0;
    int NumberClustertoSend = 0;
    int NumberDoubletoSend = 0;
    int NumberIntegertoSend = 0;
    int IsItOK;

    // Process Clusters going Up the Chain
    Initialstep = true;
    int LocalTotal = UpbySteps[1];
    int StepsUp = 0;
    while (true)
    {
        // Decide if ANY node needs to communicate Up
        ++StepsUp;
        if (StepsUp >= DAVectorUtility.MPI_Size) { break; }
        int JobTotal = UpbyStepsTotal[StepsUp];
        if (JobTotal == 0) { break; }

        // Some Nodes want to go up the line
        int SourceProc = MPI.Intercommunicator.Null;
        int DestProc = MPI.Intercommunicator.Null;
        int SourceTag = 0; // Random Number
        int DestTag = 0;
        SourceProc = DAVectorUtility.MPI_Size - 1;
        DestProc = 0;
        if (CurrentNode != 0) { SourceProc = CurrentNode - 1; }
        if (CurrentNode != (DAVectorUtility.MPI_Size - 1))
        {
            DestProc = CurrentNode + 1;
        }
        else
        {
            LocalTotal = 0;
        }
        MPITransportComponentPacket SendBuffer = new MPITransportComponentPacket(LocalTotal, NumberofDoubleComponents, NumberofIntegerComponents); // Sent Buffer is EXACT size
        NumberClustertoSend = 0;
        NumberDoubletoSend = 0;
        NumberIntegertoSend = 0;
        if (LocalTotal > 0)
        { // If no data here, just send dummy packet
            if (Initialstep)
            { // Construct message to send from Initial Arrays
                for (int ClusterSendPointer = 0; ClusterSendPointer < InitialArraySize; ClusterSendPointer++)
                {
                    int PackedHost = InitialHostSpecification[ClusterSendPointer];
                    if (HostRangeProcessing)
                    {
                        int H2 = PackedHost >> (2 * ClusteringSolution.PACKINGSHIFT);
                        if (H2 <= DAVectorUtility.MPI_Rank) { continue; }
                    }
                    else
                    {
                        int H = PackedHost & ClusteringSolution.PACKINGMASK;
                        if (H <= DAVectorUtility.MPI_Rank) { continue; }
                    }
                    SendBuffer.AssociatedCreatedIndex[NumberClustertoSend] = InitialCreatedIndex[ClusterSendPointer];
                    SendBuffer.ClusterHostRange[NumberClustertoSend] = InitialHostSpecification[ClusterSendPointer];
                    if (NumberofDoubleComponents > 0)
                    {
                        for (int ComponentIndex = 0; ComponentIndex < NumberofDoubleComponents; ComponentIndex++)
                        {
                            SendBuffer.ClusterDoubleComponents[NumberDoubletoSend] = InitialDoubleComponents[ClusterSendPointer][ComponentIndex];
                            ++NumberDoubletoSend;
                        }
                    }
                    if (NumberofIntegerComponents > 0)
                    {
                        for (int ComponentIndex = 0; ComponentIndex < NumberofIntegerComponents; ComponentIndex++)
                        {
                            SendBuffer.ClusterIntegerComponents[NumberIntegertoSend] = InitialIntegerComponents[ClusterSendPointer][ComponentIndex];
                            ++NumberIntegertoSend;
                        }
                    }
                    ++NumberClustertoSend;
                    if (NumberClustertoSend >= LocalTotal) { break; }
                }
            }
            else
            { // Construct message to send from en passant data
                for (int ReceivedClusterIndex = 0; ReceivedClusterIndex < ReceivedTotal; ReceivedClusterIndex++)
                {
                    int PackedHost = TransportComponent.ClusterHostRange[ReceivedClusterIndex];
                    if (HostRangeProcessing)
                    {
                        int H2 = PackedHost >> (2 * ClusteringSolution.PACKINGSHIFT);
                        if (H2 <= DAVectorUtility.MPI_Rank) { continue; }
                    }
                    else
                    {
                        int H = PackedHost & ClusteringSolution.PACKINGMASK;
                        if (H <= DAVectorUtility.MPI_Rank) { continue; }
                    }
                    SendBuffer.AssociatedCreatedIndex[NumberClustertoSend] = TransportComponent.AssociatedCreatedIndex[ReceivedClusterIndex];
                    SendBuffer.ClusterHostRange[NumberClustertoSend] = TransportComponent.ClusterHostRange[ReceivedClusterIndex];
                    if (NumberofDoubleComponents > 0)
                    {
                        int OverallIndex = NumberofDoubleComponents * ReceivedClusterIndex;
                        for (int ComponentIndex = 0; ComponentIndex < NumberofDoubleComponents; ComponentIndex++)
                        {
                            SendBuffer.ClusterDoubleComponents[NumberDoubletoSend] = TransportComponent.ClusterDoubleComponents[OverallIndex];
                            ++NumberDoubletoSend;
                            ++OverallIndex;
                        }
                    }
                    if (NumberofIntegerComponents > 0)
                    {
                        int OverallIndex = NumberofIntegerComponents * ReceivedClusterIndex;
                        for (int ComponentIndex = 0; ComponentIndex < NumberofIntegerComponents; ComponentIndex++)
                        {
                            SendBuffer.ClusterIntegerComponents[NumberIntegertoSend] = TransportComponent.ClusterIntegerComponents[OverallIndex];
                            ++NumberIntegertoSend;
                            ++OverallIndex;
                        }
                    }
                    ++NumberClustertoSend;
                    if (NumberClustertoSend >= LocalTotal) { break; }
                }
            }
        } // End Case where there is Local Data to Send

        // Send data in a pipeline forward
        DAVectorUtility.StartSubTimer(DAVectorUtility.MPISENDRECEIVETiming);
        DAVectorUtility.MPI_communicator.SendReceive<MPITransportComponentPacket>(SendBuffer, DestProc, DestTag, SourceProc, SourceTag, out TransportComponent, out DistributedClusteringSolution.MPISecStatus);
        DAVectorUtility.StopSubTimer(DAVectorUtility.MPISENDRECEIVETiming);
        ++Program.NumberPipelineSteps;
        Program.NumberofPipelineClusters += SendBuffer.NumberofClusters;

        // Examine Data passed from lower ranked processor
        // Set new LocalTotal and Store Data
        ReceivedTotal = TransportComponent.NumberofClusters;
        Program.ActualMaxMPITransportBuffer = Math.Max(Program.ActualMaxMPITransportBuffer, ReceivedTotal);
        LocalTotal = 0; // Count Number of Clusters on next step
        if (NumberofDoubleComponents != TransportComponent.NumberofDoubleComponents)
        {
            Exception e = DAVectorUtility.SALSAError(" Double Components Inconsistent " + NumberofDoubleComponents.ToString() + " " + TransportComponent.NumberofDoubleComponents.ToString() + " in Rank " + DAVectorUtility.MPI_Rank.ToString() + " Bad");
            throw (e);
        }
        if (NumberofIntegerComponents != TransportComponent.NumberofIntegerComponents)
        {
            Exception e = DAVectorUtility.SALSAError(" Integer Components Inconsistent " + NumberofIntegerComponents.ToString() + " " + TransportComponent.NumberofIntegerComponents.ToString() + " in Rank " + DAVectorUtility.MPI_Rank.ToString() + " Bad");
            throw (e);
        }
        if (ReceivedTotal > 0)
        {
            for (int ReceivedClusterIndex = 0; ReceivedClusterIndex < ReceivedTotal; ReceivedClusterIndex++)
            {
                int PackedHost = TransportComponent.ClusterHostRange[ReceivedClusterIndex];
                if (HostRangeProcessing)
                {
                    int H2 = PackedHost >> (2 * ClusteringSolution.PACKINGSHIFT);
                    if (H2 < DAVectorUtility.MPI_Rank)
                    {
                        Exception e = DAVectorUtility.SALSAError(" Transported host " + PackedHost.ToString() + " in Rank " + DAVectorUtility.MPI_Rank.ToString() + " Bad Up Range");
                        throw (e);
                    }
                    if (H2 > DAVectorUtility.MPI_Rank) { ++LocalTotal; }
                }
                else
                {
                    int H = PackedHost & ClusteringSolution.PACKINGMASK;
                    if (H < DAVectorUtility.MPI_Rank)
                    {
                        Exception e = DAVectorUtility.SALSAError(" Transported host " + PackedHost.ToString() + " in Rank " + DAVectorUtility.MPI_Rank.ToString() + " Bad Up Host");
                        throw (e);
                    }
                    if (H > DAVectorUtility.MPI_Rank)
                    {
                        ++LocalTotal;
                        continue;
                    }
                }
                int host = PackedHost & ClusteringSolution.PACKINGMASK;
                int CreatedIndex = TransportComponent.AssociatedCreatedIndex[ReceivedClusterIndex];
                if (UpdateMode < 2)
                {
                    FinalDataLocationIndex = -1;
                    IsItOK = DistributedClusteringSolution.IndicesperCluster(CreatedIndex, -1, ref NodeStoragePosition, ref TransportedStoragePosition, ref NodeAccumulationPosition, ref ThreadAccumulationPosition);
                    if (StorageMode == 1) { FinalDataLocationIndex = NodeStoragePosition; }
                    if (StorageMode == 2) { FinalDataLocationIndex = TransportedStoragePosition; }
                    if (StorageMode == 3) { FinalDataLocationIndex = NodeAccumulationPosition; }
                    if ((host == DAVectorUtility.MPI_Rank) && (IsItOK != 0))
                    {
                        Exception e = DAVectorUtility.SALSAError(" Transported Created Index " + CreatedIndex.ToString() + " in Rank " + DAVectorUtility.MPI_Rank.ToString() + " Bad with code " + IsItOK.ToString() + " host " + host.ToString() + " Update mode " + UpdateMode.ToString());
                        throw (e);
                    }
                }
                else
                { // UpdateMode 2
                    IsItOK = 0;
                    FinalDataLocationIndex = FinalClusterCount;
                    ++FinalClusterCount;
                    FinalCreatedIndex[FinalDataLocationIndex] = CreatedIndex;
                    FinalHostSpecification[FinalDataLocationIndex] = PackedHost;
                }
                if (IsItOK >= 0)
                {
                    if (FinalDataLocationIndex == -1)
                    {
                        Exception e = DAVectorUtility.SALSAError(" Transported Created Index " + CreatedIndex.ToString() + " in Rank " + DAVectorUtility.MPI_Rank.ToString() + " Bad with Storage Mode " + StorageMode.ToString() + " host " + host.ToString() + " Update mode " + UpdateMode.ToString());
                        throw (e);
                    }
                    if (NumberofDoubleComponents > 0)
                    {
                        string message = "";
                        int OverallDoubleIndex = NumberofDoubleComponents * ReceivedClusterIndex;
                        for (int ComponentIndex = 0; ComponentIndex < NumberofDoubleComponents; ComponentIndex++)
                        {
                            if ((UpdateMode == 0) || (UpdateMode == 2))
                            {
                                FinalDoubleComponents[FinalDataLocationIndex][ComponentIndex] = TransportComponent.ClusterDoubleComponents[OverallDoubleIndex];
                            }
                            if (UpdateMode == 1)
                            {
                                FinalDoubleComponents[FinalDataLocationIndex][ComponentIndex] += TransportComponent.ClusterDoubleComponents[OverallDoubleIndex];
                            }
                            message += " * " + TransportComponent.ClusterDoubleComponents[OverallDoubleIndex].ToString("E3") + " " + FinalDoubleComponents[FinalDataLocationIndex][ComponentIndex].ToString("F3");
                            ++OverallDoubleIndex;
                        }
                        /* if (CreatedIndex == 901)
                         *     DAVectorUtility.SALSAFullPrint(1, "Up901 Transport " + UpdateMode.ToString() + " " + FinalDataLocationIndex.ToString() + message); */
                    }
                    if (NumberofIntegerComponents > 0)
                    {
                        int OverallIntegerIndex = NumberofIntegerComponents * ReceivedClusterIndex;
                        for (int ComponentIndex = 0; ComponentIndex < NumberofIntegerComponents; ComponentIndex++)
                        {
                            if ((UpdateMode == 0) || (UpdateMode == 2))
                            {
                                FinalIntegerComponents[FinalDataLocationIndex][ComponentIndex] = TransportComponent.ClusterIntegerComponents[OverallIntegerIndex];
                            }
                            if (UpdateMode == 1)
                            {
                                FinalIntegerComponents[FinalDataLocationIndex][ComponentIndex] += TransportComponent.ClusterIntegerComponents[OverallIntegerIndex];
                            }
                            ++OverallIntegerIndex;
                        }
                    }
                } // End case where location found IsItOK >= 0
            } // end Loop over ReceivedClusterIndex
        } // End case where ReceivedTotal > 0
        Initialstep = false;
    } // End While over MPI pipeline steps for pipeline going UP the chain

    // Process Clusters going Down the Chain
    Initialstep = true;
    int StepsDown = 0;
    LocalTotal = DownbySteps[1];
    while (true)
    {
        StepsDown++;
        if (StepsDown >= DAVectorUtility.MPI_Size) { break; }
        int JobTotal = DownbyStepsTotal[StepsDown];
        if (JobTotal == 0) { break; }

        // Some Nodes want to go down the line
        int SourceProc = MPI.Intercommunicator.Null;
        int DestProc = MPI.Intercommunicator.Null;
        DestProc = DAVectorUtility.MPI_Size - 1;
        SourceProc = 0;
        int SourceTag = 22; // Random Number
        int DestTag = 22;
        if (CurrentNode != 0)
        {
            DestProc = CurrentNode - 1;
        }
        else
        {
            LocalTotal = 0;
        }
        if (CurrentNode != (DAVectorUtility.MPI_Size - 1)) { SourceProc = CurrentNode + 1; }
        MPITransportComponentPacket SendBuffer = new MPITransportComponentPacket(LocalTotal, NumberofDoubleComponents, NumberofIntegerComponents); // Sent Buffer is EXACT size
        NumberClustertoSend = 0;
        NumberDoubletoSend = 0;
        NumberIntegertoSend = 0;
        if (LocalTotal > 0)
        { // If no data here, just send dummy packet
            if (Initialstep)
            { // Construct message to send from local accumulation arrays
                for (int ClusterIndirectIndex = 0; ClusterIndirectIndex < InitialArraySize; ClusterIndirectIndex++)
                {
                    int PackedHost = InitialHostSpecification[ClusterIndirectIndex];
                    if (HostRangeProcessing)
                    {
                        int H1 = (PackedHost >> ClusteringSolution.PACKINGSHIFT) & ClusteringSolution.PACKINGMASK;
                        if (H1 >= DAVectorUtility.MPI_Rank) { continue; }
                    }
                    else
                    {
                        int H = PackedHost & ClusteringSolution.PACKINGMASK;
                        if (H >= DAVectorUtility.MPI_Rank) { continue; }
                    }
                    SendBuffer.AssociatedCreatedIndex[NumberClustertoSend] = InitialCreatedIndex[ClusterIndirectIndex];
                    SendBuffer.ClusterHostRange[NumberClustertoSend] = InitialHostSpecification[ClusterIndirectIndex];
                    if (NumberofDoubleComponents > 0)
                    {
                        for (int ComponentIndex = 0; ComponentIndex < NumberofDoubleComponents; ComponentIndex++)
                        {
                            SendBuffer.ClusterDoubleComponents[NumberDoubletoSend] = InitialDoubleComponents[ClusterIndirectIndex][ComponentIndex];
                            ++NumberDoubletoSend;
                        }
                    }
                    if (NumberofIntegerComponents > 0)
                    {
                        for (int ComponentIndex = 0; ComponentIndex < NumberofIntegerComponents; ComponentIndex++)
                        {
                            SendBuffer.ClusterIntegerComponents[NumberIntegertoSend] = InitialIntegerComponents[ClusterIndirectIndex][ComponentIndex];
                            ++NumberIntegertoSend;
                        }
                    }
                    ++NumberClustertoSend;
                    if (NumberClustertoSend >= LocalTotal) { break; }
                }
            }
            else
            { // Construct message to send from en passant data
                for (int ReceivedClusterIndex = 0; ReceivedClusterIndex < ReceivedTotal; ReceivedClusterIndex++)
                {
                    int PackedHost = TransportComponent.ClusterHostRange[ReceivedClusterIndex];
                    if (HostRangeProcessing)
                    {
                        int H1 = (PackedHost >> ClusteringSolution.PACKINGSHIFT) & ClusteringSolution.PACKINGMASK;
                        if (H1 >= DAVectorUtility.MPI_Rank) { continue; }
                    }
                    else
                    {
                        int H = PackedHost & ClusteringSolution.PACKINGMASK;
                        if (H >= DAVectorUtility.MPI_Rank) { continue; }
                    }
                    SendBuffer.AssociatedCreatedIndex[NumberClustertoSend] = TransportComponent.AssociatedCreatedIndex[ReceivedClusterIndex];
                    SendBuffer.ClusterHostRange[NumberClustertoSend] = TransportComponent.ClusterHostRange[ReceivedClusterIndex];
                    if (NumberofDoubleComponents > 0)
                    {
                        int OverallIndex = NumberofDoubleComponents * ReceivedClusterIndex;
                        for (int ComponentIndex = 0; ComponentIndex < NumberofDoubleComponents; ComponentIndex++)
                        {
                            SendBuffer.ClusterDoubleComponents[NumberDoubletoSend] = TransportComponent.ClusterDoubleComponents[OverallIndex];
                            ++NumberDoubletoSend;
                            ++OverallIndex;
                        }
                    }
                    if (NumberofIntegerComponents > 0)
                    {
                        int OverallIndex = NumberofIntegerComponents * ReceivedClusterIndex;
                        for (int ComponentIndex = 0; ComponentIndex < NumberofIntegerComponents; ComponentIndex++)
                        {
                            SendBuffer.ClusterIntegerComponents[NumberIntegertoSend] = TransportComponent.ClusterIntegerComponents[OverallIndex];
                            ++NumberIntegertoSend;
                            ++OverallIndex;
                        }
                    }
                    ++NumberClustertoSend;
                    if (NumberClustertoSend >= LocalTotal) { break; }
                }
            } // end en passant data is source of information
        } // End Case where there is Local Data to Send

        // Send data in a pipeline backwards
        DAVectorUtility.StartSubTimer(DAVectorUtility.MPISENDRECEIVETiming);
        DAVectorUtility.MPI_communicator.SendReceive<MPITransportComponentPacket>(SendBuffer, DestProc, DestTag, SourceProc, SourceTag, out TransportComponent, out DistributedClusteringSolution.MPISecStatus);
        DAVectorUtility.StopSubTimer(DAVectorUtility.MPISENDRECEIVETiming);
        ++Program.NumberPipelineSteps;
        Program.NumberofPipelineClusters += SendBuffer.NumberofClusters;

        // Examine Data passed from higher ranked processor
        ReceivedTotal = TransportComponent.NumberofClusters;
        Program.ActualMaxMPITransportBuffer = Math.Max(Program.ActualMaxMPITransportBuffer, ReceivedTotal);
        LocalTotal = 0;
        if (NumberofDoubleComponents != TransportComponent.NumberofDoubleComponents)
        {
            Exception e = DAVectorUtility.SALSAError(" Double Components Inconsistent " + NumberofDoubleComponents.ToString() + " " + TransportComponent.NumberofDoubleComponents.ToString() + " in Rank " + DAVectorUtility.MPI_Rank.ToString() + " Bad");
            throw (e);
        }
        if (NumberofIntegerComponents != TransportComponent.NumberofIntegerComponents)
        {
            Exception e = DAVectorUtility.SALSAError(" Integer Components Inconsistent " + NumberofIntegerComponents.ToString() + " " + TransportComponent.NumberofIntegerComponents.ToString() + " in Rank " + DAVectorUtility.MPI_Rank.ToString() + " Bad");
            throw (e);
        }
        if (ReceivedTotal > 0)
        {
            for (int ReceivedClusterIndex = 0; ReceivedClusterIndex < ReceivedTotal; ReceivedClusterIndex++)
            {
                int PackedHost = TransportComponent.ClusterHostRange[ReceivedClusterIndex];
                if (HostRangeProcessing)
                {
                    int H1 = (PackedHost >> ClusteringSolution.PACKINGSHIFT) & ClusteringSolution.PACKINGMASK;
                    if (H1 > DAVectorUtility.MPI_Rank)
                    {
                        Exception e = DAVectorUtility.SALSAError(" Transported host " + PackedHost.ToString() + " in Rank " + DAVectorUtility.MPI_Rank.ToString() + " Bad Down Range");
                        throw (e);
                    }
                    if (H1 < DAVectorUtility.MPI_Rank) { ++LocalTotal; }
                }
                else
                {
                    int H = PackedHost & ClusteringSolution.PACKINGMASK;
                    if (H > DAVectorUtility.MPI_Rank)
                    {
                        Exception e = DAVectorUtility.SALSAError(" Transported host " + PackedHost.ToString() + " in Rank " + DAVectorUtility.MPI_Rank.ToString() + " Bad Down Not Range");
                        throw (e);
                    }
                    if (H < DAVectorUtility.MPI_Rank)
                    {
                        ++LocalTotal;
                        continue;
                    }
                }
                int host = PackedHost & ClusteringSolution.PACKINGMASK;
                int CreatedIndex = TransportComponent.AssociatedCreatedIndex[ReceivedClusterIndex];
                if (UpdateMode < 2)
                {
                    FinalDataLocationIndex = -1;
                    IsItOK = DistributedClusteringSolution.IndicesperCluster(CreatedIndex, -1, ref NodeStoragePosition, ref TransportedStoragePosition, ref NodeAccumulationPosition, ref ThreadAccumulationPosition);
                    if (StorageMode == 1) { FinalDataLocationIndex = NodeStoragePosition; }
                    if (StorageMode == 2) { FinalDataLocationIndex = TransportedStoragePosition; }
                    if (StorageMode == 3) { FinalDataLocationIndex = NodeAccumulationPosition; }
                    if ((host == DAVectorUtility.MPI_Rank) && (IsItOK != 0))
                    {
                        Exception e = DAVectorUtility.SALSAError(" Transported Created Index " + CreatedIndex.ToString() + " in Rank " + DAVectorUtility.MPI_Rank.ToString() + " Bad with code " + IsItOK.ToString() + " host " + host.ToString() + " Update mode " + UpdateMode.ToString());
                        throw (e);
                    }
                }
                else
                {
                    IsItOK = 0;
                    FinalDataLocationIndex = FinalClusterCount;
                    ++FinalClusterCount;
                    FinalCreatedIndex[FinalDataLocationIndex] = CreatedIndex;
                    FinalHostSpecification[FinalDataLocationIndex] = PackedHost;
                }
                if (IsItOK >= 0)
                {
                    if (FinalDataLocationIndex == -1)
                    {
                        Exception e = DAVectorUtility.SALSAError(" Transported Created Index " + CreatedIndex.ToString() + " in Rank " + DAVectorUtility.MPI_Rank.ToString() + " Bad with Storage Mode " + StorageMode.ToString() + " host " + host.ToString() + " Update mode " + UpdateMode.ToString());
                        throw (e);
                    }
                    if (NumberofDoubleComponents > 0)
                    {
                        string message = "";
                        int OverallDoubleIndex = NumberofDoubleComponents * ReceivedClusterIndex;
                        for (int ComponentIndex = 0; ComponentIndex < NumberofDoubleComponents; ComponentIndex++)
                        {
                            if ((UpdateMode == 0) || (UpdateMode == 2))
                            {
                                FinalDoubleComponents[FinalDataLocationIndex][ComponentIndex] = TransportComponent.ClusterDoubleComponents[OverallDoubleIndex];
                            }
                            if (UpdateMode == 1)
                            {
                                FinalDoubleComponents[FinalDataLocationIndex][ComponentIndex] += TransportComponent.ClusterDoubleComponents[OverallDoubleIndex];
                            }
                            message += " * " + TransportComponent.ClusterDoubleComponents[OverallDoubleIndex].ToString("E3") + " " + FinalDoubleComponents[FinalDataLocationIndex][ComponentIndex].ToString("F3");
                            ++OverallDoubleIndex;
                        }
                        /* if (CreatedIndex == 901)
                         *     DAVectorUtility.SALSAFullPrint(1, "Dn901 Transport " + UpdateMode.ToString() + " " + FinalDataLocationIndex.ToString() + message); */
                    }
                    if (NumberofIntegerComponents > 0)
                    {
                        int OverallIntegerIndex = NumberofIntegerComponents * ReceivedClusterIndex;
                        for (int ComponentIndex = 0; ComponentIndex < NumberofIntegerComponents; ComponentIndex++)
                        {
                            if ((UpdateMode == 0) || (UpdateMode == 2))
                            {
                                FinalIntegerComponents[FinalDataLocationIndex][ComponentIndex] = TransportComponent.ClusterIntegerComponents[OverallIntegerIndex];
                            }
                            if (UpdateMode == 1)
                            {
                                FinalIntegerComponents[FinalDataLocationIndex][ComponentIndex] += TransportComponent.ClusterIntegerComponents[OverallIntegerIndex];
                            }
                            ++OverallIntegerIndex;
                        }
                    }
                }
            } // End Loop over Received Clusters
        } // End case when data received ReceivedTotal > 0
        Initialstep = false;
    } // End While over MPI pipeline steps going DOWN the chain
} // End PipelineDistributedBroadcast
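// Illustrative sketch, not part of the original source: PipelineDistributedBroadcast
// reads three fields out of each packed host word: the owning host (low bits), the
// lowest rank H1 in the range (middle bits) and the highest rank H2 (top bits).
// PackedHostSketch and its members are hypothetical names. Shift = 10 and
// Mask = (1 << Shift) - 1 are assumptions standing in for
// ClusteringSolution.PACKINGSHIFT and ClusteringSolution.PACKINGMASK, whose real
// values are not shown in this file.
public static class PackedHostSketch
{
    public const int Shift = 10;              // assumed field width in bits
    public const int Mask = (1 << Shift) - 1; // assumed mask covering one field

    public static int Pack(int host, int lowest, int highest)
    {
        return host | (lowest << Shift) | (highest << (2 * Shift));
    }

    // Same expressions as the H, H1 and H2 computations in the method above.
    public static int Host(int packed) { return packed & Mask; }
    public static int Lowest(int packed) { return (packed >> Shift) & Mask; }
    public static int Highest(int packed) { return packed >> (2 * Shift); }
}
// Usage: a cluster owned by rank 3 that spans ranks 1 to 5 must travel down the
// chain from rank 3 until it reaches rank 1 and up the chain until it reaches rank 5.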
public void sumoverthreadsandmpi() { Parallel.For(0, Program.ParallelOptions.MaxDegreeOfParallelism, Program.ParallelOptions, (AccumulationThreadNo) => { /* Sum over Threads */ int beginindex = DistributedClusteringSolution.ParallelNodeAccumulationRanges[AccumulationThreadNo].StartIndex; int indexlength = DistributedClusteringSolution.ParallelNodeAccumulationRanges[AccumulationThreadNo].Length; for (int NodeAccumulationIndex = beginindex; NodeAccumulationIndex < beginindex + indexlength; NodeAccumulationIndex++) { for (int ThreadNo = 0; ThreadNo < DAVectorUtility.ThreadCount; ThreadNo++) { int IndexforThread = DistributedClusteringSolution.NodeAccMetaData.AccumulationNodetoThreadClusterAssociations[NodeAccumulationIndex][ThreadNo]; if (IndexforThread >= 0) { for (int ComponentIndex = 0; ComponentIndex < NumberDoubleComponents; ComponentIndex++) { TotalVectorSum[NodeAccumulationIndex][ComponentIndex] += VectorSum[ThreadNo][IndexforThread][ComponentIndex]; } } } } }); /* End Sum over Threads */ /* Sum over nodes using pipelined transport */ if (DAVectorUtility.MPI_Size > 1) { /* Divide Clusters into 3 types */ /* NodeAccumulationClusterStatus = 2 Locally controlled Distributed Cluster Will be Updated */ /* NodeAccumulationClusterStatus = 0, 1 Global Cluster */ /* NodeAccumulationClusterStatus = 3 Remotely controlled Distributed Cluster */ /* Types 1 and 3 may send data (both) Up and Down */ /* Process Global Clusters -- there are same total in all nodes and must be stored in same order */ /* They do not need to be in same absolute position */ int LocalTotal = 0; /* Number of Global Clusters */ int NumberNodeAccumulationPoints = DistributedClusteringSolution.NodeAccMetaData.NumberofPointsperNode; for (int NodeAccumulationIndex = 0; NodeAccumulationIndex < NumberNodeAccumulationPoints; NodeAccumulationIndex++) { if (DistributedClusteringSolution.NodeAccMetaData.NodeAccumulationClusterStatus[NodeAccumulationIndex] > 1) { continue; } ++LocalTotal; } if (LocalTotal > 0) { int LocalTotaltimesComponents = 
LocalTotal * NumberDoubleComponents; double[] GlobalClusterComponent = new double[LocalTotaltimesComponents]; int[] GlobalClusterIndex = new int[LocalTotal]; int NumberGlobal1 = 0; int NumberGlobal2 = 0; for (int NodeAccumulationIndex = 0; NodeAccumulationIndex < NumberNodeAccumulationPoints; NodeAccumulationIndex++) { int LocalStatus = DistributedClusteringSolution.NodeAccMetaData.NodeAccumulationClusterStatus[NodeAccumulationIndex]; if ((LocalStatus < 0) || (LocalStatus > 1)) { continue; } for (int ComponentIndex = 0; ComponentIndex < NumberDoubleComponents; ComponentIndex++) { GlobalClusterComponent[NumberGlobal2] = TotalVectorSum[NodeAccumulationIndex][ComponentIndex]; ++NumberGlobal2; } GlobalClusterIndex[NumberGlobal1] = NodeAccumulationIndex; ++NumberGlobal1; if (NumberGlobal1 >= LocalTotal) { break; } } DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming6); GlobalClusterComponent = DAVectorUtility.MPI_communicator.Allreduce <double>(GlobalClusterComponent, Operation <double> .Add); DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming6); NumberGlobal2 = 0; for (int LocalIndex = 0; LocalIndex < LocalTotal; LocalIndex++) { for (int ComponentIndex = 0; ComponentIndex < NumberDoubleComponents; ComponentIndex++) { /* Unpack with the same running flat index used when packing */ TotalVectorSum[GlobalClusterIndex[LocalIndex]][ComponentIndex] = GlobalClusterComponent[NumberGlobal2]; ++NumberGlobal2; } } } /* End Case where there are Global Clusters */ /* Distributed Clusters */ DAVectorUtility.StartSubTimer(DAVectorUtility.MPIDistributedREDUCETiming); int FinalClusterCount = 0; /* Dummy */ DistributedSynchronization.TransportviaPipeline DoDistributedTransfer = new DistributedSynchronization.TransportviaPipeline(1, false, NumberDoubleComponents, 0, NumberNodeAccumulationPoints, 3); DoDistributedTransfer.PipelineDistributedBroadcast(TotalVectorSum, TotalVectorSum, null, null, DistributedClusteringSolution.NodeAccMetaData.NodeAccumulationCreatedIndices, 
DistributedClusteringSolution.NodeAccMetaData.NodeAccumulationCreatedIndices, DistributedClusteringSolution.NodeAccMetaData.NodeAccumulationClusterHosts, DistributedClusteringSolution.NodeAccMetaData.NodeAccumulationClusterHosts, ref FinalClusterCount); DAVectorUtility.StopSubTimer(DAVectorUtility.MPIDistributedREDUCETiming); } /* End Case where MPI needed */ } /* End sumoverthreadsandmpi() */
public void sumoverthreadsandmpi() { for (int storeloop = 0; storeloop < Numbertofind; storeloop++) { TotalMinValue[storeloop] = -1.0; TotalIndexValue[storeloop] = -1; } TotalWorst = -1; for (int ThreadNo = 0; ThreadNo < NumberofThreads; ThreadNo++) { for (int storeloop = 0; storeloop < Numbertofind; storeloop++) { if (IndexValuebythread[ThreadNo][storeloop] < 0) { continue; /* End this thread */ } FindMinimumSet(MinValuebythread[ThreadNo][storeloop], IndexValuebythread[ThreadNo][storeloop], ref TotalWorst, TotalMinValue, TotalIndexValue, Numbertofind); } } if (DAVectorUtility.MPI_Size > 1) { DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming1); TotalNumberofPoints = DAVectorUtility.MPI_communicator.Allreduce <double>(TotalNumberofPoints, Operation <double> .Add); DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming1); } /* Sort in absolute order and accumulate over processes. This takes Numbertofind steps */ for (int OrderLoop = 0; OrderLoop < Numbertofind; OrderLoop++) { int localindex = -1; /* unset */ double localvalue = -1.0; int loopused = -1; for (int internalloop = 0; internalloop < Numbertofind; internalloop++) { /* Find minimum */ if (TotalIndexValue[internalloop] < 0) { continue; } if ((localindex < 0) || (TotalMinValue[internalloop] < localvalue)) { localindex = TotalIndexValue[internalloop]; localvalue = TotalMinValue[internalloop]; loopused = internalloop; } } int oldlocalindex = localindex; if (DAVectorUtility.MPI_Size > 1) { DAVectorUtility.StartSubTimer(DAVectorUtility.MPIREDUCETiming1); DAVectorUtility.AllReduceMinWithIndex(ref localvalue, ref localindex); DAVectorUtility.StopSubTimer(DAVectorUtility.MPIREDUCETiming1); } OrderedMinValue[OrderLoop] = localvalue; OrderedIndexValue[OrderLoop] = localindex; if ((oldlocalindex >= 0) && (OrderedIndexValue[OrderLoop] == oldlocalindex)) { TotalIndexValue[loopused] = -1; TotalMinValue[loopused] = -1.0; } } /* Loop over Order Loop */ return; }